Abstract. Let $X$ be a matrix sampled uniformly from the set of doubly stochastic matrices of size $n \times n$. We show that the empirical spectral distribution of the normalized matrix $\sqrt{n}(X - \mathbb{E}X)$ converges almost surely to the circular law. This confirms a conjecture of Chatterjee, Diaconis and Sly.

1. Introduction

Let $M$ be a matrix of size $n \times n$ and let $\lambda_1, \dots, \lambda_n$ be the eigenvalues of $M$. The empirical spectral distribution (ESD) $\mu_M$ of $M$ is defined as

$$\mu_M(s,t) := \frac{1}{n}\left|\{k \le n : \Re(\lambda_k) \le s,\ \Im(\lambda_k) \le t\}\right|.$$

We also define $\mu_{\mathrm{cir}}$ as the uniform distribution over the unit disk,

$$\mu_{\mathrm{cir}}(s,t) := \frac{1}{\pi}\operatorname{mes}\left(\{z : |z| \le 1,\ \Re(z) \le s,\ \Im(z) \le t\}\right).$$

Resolving a long standing conjecture in random matrix theory, Tao and Vu (appendix by 
Krishnapur) have proved that the ESD of random i.i.d. matrices obeys the circular law. 

Theorem 1.1. [34] Assume that the entries of $M$ are i.i.d. copies of a complex random variable of mean zero and variance one. Then the ESD of the matrix $\frac{1}{\sqrt{n}}M$ converges almost surely to the circular measure $\mu_{\mathrm{cir}}$.

This result is built on earlier developments by Girko, Bai, Götze-Tikhomirov, Pan-Zhou [26] and by many others. In view of the universality phenomenon, it is important to study the law for random matrices with non-independent entries. Probably one of the first results in this direction is due to Bordenave, Caputo and Chafaï [6], who proved the following.

Theorem 1.2. [6, Theorem 1.3] Let $X$ be a random matrix of size $n \times n$ whose entries are i.i.d. copies of a non-negative continuous random variable with finite variance $\sigma^2$ and bounded density function. Then with probability one the ESD of the normalized matrix $\sqrt{n}\,\bar{X}$, where $\bar{X} = (\bar{x}_{ij})_{1 \le i,j \le n}$ and $\bar{x}_{ij} := x_{ij}/(x_{i1} + \dots + x_{in})$, converges weakly to the circular measure $\mu_{\mathrm{cir}}$.
In particular, when $x_{11}$ follows the exponential law of mean one, Theorem 1.2 establishes the circular law for the Dirichlet Markov ensemble (see also [7]).
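As a purely illustrative aside, the Dirichlet Markov ensemble is easy to simulate: one draws i.i.d. mean-one exponentials and normalizes each row. The sketch below assumes NumPy; the function name, the seed and the size $n = 200$ are choices made here and are not taken from [6]. Since every row sums to one, $\sqrt{n}$ is always an eigenvalue of $\sqrt{n}\,\bar{X}$, and the circular law describes the remaining bulk.

```python
import numpy as np

def dirichlet_markov_matrix(n, rng):
    """Row-normalise an n x n matrix of i.i.d. mean-one exponentials."""
    raw = rng.exponential(scale=1.0, size=(n, n))
    return raw / raw.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)   # arbitrary seed, only for reproducibility
n = 200
X_bar = dirichlet_markov_matrix(n, rng)
eigenvalues = np.linalg.eigvals(np.sqrt(n) * X_bar)

# Each row sums to one, so sqrt(n) is an eigenvalue (the Perron root);
# the remaining eigenvalues form the bulk that the circular law describes.
perron_index = int(np.argmax(np.abs(eigenvalues)))
bulk = np.delete(eigenvalues, perron_index)
print("largest modulus:", np.abs(eigenvalues[perron_index]))
print("bulk radius    :", np.abs(bulk).max())
```

A scatter plot of `bulk` in the complex plane would be the natural way to compare the simulation with the unit disk.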

Related results with "linear" assumption of independence include a result of Tao, who 
among other things proves the circular law for random zero-sum matrices. 

Theorem 1.3. [Tao, Theorem 1.13] Let $X$ be a random matrix of size $n \times n$ whose entries are i.i.d. copies of a random variable of mean zero and variance one. Then the ESD of the normalized matrix $\frac{1}{\sqrt{n}}\bar{X}$, where $\bar{X} = (\bar{x}_{ij})_{1 \le i,j \le n}$ and $\bar{x}_{ij} := x_{ij} - \frac{1}{n}(x_{i1} + \dots + x_{in})$, converges almost surely to the circular measure $\mu_{\mathrm{cir}}$.



With a slightly different assumption of dependence, Vu and the current author showed in 
[25] the following. 

Theorem 1.4. [25, Theorem 1.2] Let $0 < \epsilon < 1$ be a positive constant. Let $M_n$ be a random $(-1,1)$ matrix of size $n \times n$ whose rows are independent vectors of given row-sum $s$ with some $s$ satisfying $|s| \le (1-\epsilon)n$. Then the ESD of the normalized matrix $\frac{1}{\sigma\sqrt{n}}M_n$, where $\sigma^2 = 1 - (\frac{s}{n})^2$, converges almost surely to the distribution $\mu_{\mathrm{cir}}$ as $n$ tends to $\infty$.



To some extent, the matrix model in Theorem 1.4 is a discrete version of the random Markov matrices considered in Theorem 1.2 where the entries are now restricted to $\pm 1/s$. However, it is probably more suitable to compare this model with that of random Bernoulli matrices. By Theorem 1.1, the ESD of the normalized random Bernoulli matrices obeys the circular law, and hence Theorem 1.4 serves as a local version of the law.



Although the entries of the matrices above are mildly correlated, the rows are still independent. This leaves enough room to adapt the existing approaches to these problems. Our focus in this note is on a matrix model whose rows and columns are not independent.

Theorem 1.5 (Circular law for random doubly stochastic matrices). Let $X$ be a matrix chosen uniformly from the set of doubly stochastic matrices. Then the ESD of the normalized matrix $\sqrt{n}(X - \mathbb{E}X)$ converges almost surely to $\mu_{\mathrm{cir}}$.



Little is known about the properties of random doubly stochastic matrices, as they fall outside the scope of techniques from the usual random matrix theory. However, there have been recent breakthroughs by Barvinok and Hartigan (see for instance [3, 4, 5]). The Birkhoff polytope $\mathcal{M}_n$, which is the set of doubly stochastic matrices of size $n \times n$, is the basic object in operations research because of its appearance as the feasible set for the assignment problem. Doubly stochastic matrices also serve as a natural model for priors in statistical analysis of Markov chains. There is a close connection between the Birkhoff polytope and $MS(n,c)$, the set of matrices of size $n \times n$ with non-negative integer entries and all column sums and row sums equal to $c$. These matrices are called magic squares, which are well known in enumerative combinatorics. We refer the reader to the work of Chatterjee, Diaconis and Sly [8] for further discussion.

There is a strong belief that random doubly stochastic matrices behave like i.i.d. random matrices. This intuition has been verified in [8] in many ways. Among other things, it has been shown that the normalized entry $n x_{11}$ converges in total variation to an exponential random variable of mean one. More generally, the authors of [8] showed that the normalized projection $nX_k$, where $X_k$ is the submatrix generated by the first $k$ rows and columns
of $X$ and where $k = O(\frac{\sqrt{n}}{\log n})$, converges in total variation to the matrix of independent
exponential random variables. 

Regarding the spectral distribution of $X$, it has been shown by Chatterjee, Diaconis and Sly that the empirical distribution of the singular values of $\sqrt{n}(X - \mathbb{E}X)$ obeys the quarter-circular law.

Theorem 1.6. [8, Theorem 3] Let $0 \le \sigma_1, \dots, \sigma_n$ be the singular values of $\sqrt{n}(X - \mathbb{E}X)$, where $X$ is a random doubly stochastic matrix. Then the empirical spectral measure $\frac{1}{n}\sum_{i \le n}\delta_{\sigma_i}$ converges in probability and in weak topology to the quarter-circle measure $\frac{1}{\pi}\sqrt{4 - x^2}\,\mathbf{1}_{[0,2]}\,dx$.



The key ingredients in the proof of Theorem 1.6 are a sharp concentration result coupled with two transference principles (Lemmas 2.2 and 2.3 below). These principles help translate results from i.i.d. random matrices of independent exponential random variables to random doubly stochastic matrices.



It has been conjectured in [8] that the empirical spectral distribution of $\sqrt{n}(X - \mathbb{E}X)$ obeys the circular law, which we confirm now. For the rest of this section we sketch the general plan to attack Theorem 1.5.



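Because the entries of $X$ are exchangeable, $\mathbb{E}X$ is the matrix $J_n$ whose entries all equal $1/n$. The matrix $X - \mathbb{E}X$ has a zero eigenvalue, and we want to single this outlier out for several technical reasons. One way to do this is to pass to $\bar{X}$, a matrix of size $(n-1) \times (n-1)$ defined as

$$\bar{X} := \begin{pmatrix} x_{22} - x_{21} & \cdots & x_{2n} - x_{21} \\ x_{32} - x_{31} & \cdots & x_{3n} - x_{31} \\ \vdots & & \vdots \\ x_{n2} - x_{n1} & \cdots & x_{nn} - x_{n1} \end{pmatrix}.$$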



It is not hard to show that the spectrum of $\sqrt{n}(X - \mathbb{E}X)$ is the union of zero and the spectrum of $\sqrt{n}\bar{X}$. Indeed, consider the matrix $\lambda I_n - \sqrt{n}(X - \mathbb{E}X)$. By adding all other rows to its first row, and then subtracting the first column from every other column, we arrive at a matrix whose determinant is $\lambda\det(\lambda I_{n-1} - \sqrt{n}\bar{X})$, thus confirming our observation. Hence, it is enough to prove the circular law for $\bar{X}$.
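The row and column operations above can be checked numerically. The following sketch assumes NumPy and, for convenience only, builds a doubly stochastic matrix as a random convex combination of permutation matrices; this is not the uniform measure on the Birkhoff polytope, but the spectral identity holds for every doubly stochastic matrix. All function names are hypothetical.

```python
import numpy as np

def random_doubly_stochastic(n, num_perms, rng):
    """Convex combination of random permutation matrices (NOT uniform on the polytope)."""
    weights = rng.dirichlet(np.ones(num_perms))
    X = np.zeros((n, n))
    for w in weights:
        X[np.arange(n), rng.permutation(n)] += w
    return X

def reduced_matrix(X):
    """Return X_bar = (x_ij - x_i1) for 2 <= i, j <= n (1-based indices)."""
    return X[1:, 1:] - X[1:, [0]]

def max_mismatch(a, b):
    """Largest distance from a point of a to the nearest point of b."""
    return max(np.abs(b - z).min() for z in a)

rng = np.random.default_rng(1)
n = 6
X = random_doubly_stochastic(n, num_perms=40, rng=rng)
J = np.full((n, n), 1.0 / n)          # the all-1/n matrix used for centring

spec_full = np.linalg.eigvals(np.sqrt(n) * (X - J))
spec_reduced = np.linalg.eigvals(np.sqrt(n) * reduced_matrix(X))
expected = np.append(spec_reduced, 0.0)   # spectrum of the reduced matrix plus zero

print(max_mismatch(expected, spec_full), max_mismatch(spec_full, expected))
```

Both printed distances should be at the level of rounding error, which is the numerical counterpart of the determinant identity.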

Theorem 1.7 (Main theorem). Let $X$ be a matrix chosen uniformly from the set of doubly stochastic matrices. Then the ESD of the matrix $\sqrt{n}\bar{X}$ converges almost surely to $\mu_{\mathrm{cir}}$.



One way to prove our main result above is to show that the Stieltjes transform of $\mu_{\sqrt{n}\bar{X}}$ converges to that of the circular measure. However, it is slightly more convenient to work with the logarithmic potential. We will mainly rely on the following machinery from [34, Theorem 2.1].

Lemma 1.8. Suppose that $M = (m_{ij})_{1 \le i,j \le n}$ is a random matrix. Assume that

• $\frac{1}{n}\|M\|_{HS}^2 = \frac{1}{n}\sum_{i,j}|m_{ij}|^2$ is bounded almost surely;

• for almost all complex numbers $z_0$, the logarithmic potential $\frac{1}{n}\log|\det(M - z_0 I_n)|$ converges almost surely to $f(z_0) = \int_{\mathbb{C}}\log|w - z_0|\,d\mu_{\mathrm{cir}}(w)$.

Then $\mu_M$ converges almost surely to $\mu_{\mathrm{cir}}$.



We will break the main task into two parts, one showing the boundedness and one proving 
the convergence. 

Theorem 1.9. Let $X$ be a matrix chosen uniformly from the set of doubly stochastic matrices. Then the square sum $\sum_{2 \le i,j \le n}(x_{ij} - x_{i1})^2$ is bounded almost surely.



The proof of Theorem 1.9 will be presented at the end of Section 2. The heart of our paper is to establish the convergence of $\frac{1}{n}\log|\det(\sqrt{n}\bar{X} - z_0 I_{n-1})|$.

Theorem 1.10. For almost all complex numbers $z_0$, $\frac{1}{n}\log|\det(\sqrt{n}\bar{X} - z_0 I_{n-1})|$ converges almost surely to $f(z_0)$.



The main difficulty in establishing Theorem 1.10 is that the entries in each row and each column of $\bar{X}$ are not at all independent. To the best of our knowledge, the convergence for such a model has not been studied before in the literature. We will present its proof in Section 6.

Notation. Here and later, asymptotic notations such as $O$, $\Omega$, $\Theta$, and so forth, are used under the assumption that $n \to \infty$. A notation such as $O_C(\cdot)$ emphasizes that the hidden constant in $O$ depends on $C$.

For a matrix $M$, we use the notation $r_i(M)$ and $c_j(M)$ to denote its $i$-th row and $j$-th column respectively. For an event $A$, we use the subscript $P_x(A)$ to emphasize that the probability under consideration is taken with respect to the random vector $x$.

For a real or complex vector $v = (v_1, \dots, v_n)$, we will use the shorthand $\|v\|$ for its $L_2$-norm $(\sum_i |v_i|^2)^{1/2}$.



2. Some properties of random doubly stochastic matrices 



We will gather here some basic properties of random doubly stochastic matrices. The reader 
is invited to consult [8] for further insight and applications. 



2.1. Relation to random i.i.d. matrix of exponentials. Let $\mathcal{M}_n$ be the Birkhoff polytope generated by the permutation matrices. Let $\Phi$ be the projection from $\mathbb{R}^{n^2}$ to $\mathbb{R}^{(n-1)^2}$ mapping $(x_{ij})_{1 \le i,j \le n}$ to $(x_{ij})_{2 \le i,j \le n}$.



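Let $\Gamma : \mathbb{R}^{(n-1)^2} \to \mathbb{R}^{n^2}$ denote the following function

$$\Gamma(X) = (\Gamma(X)_{ij}), \qquad \Gamma(X)_{ij} := \begin{cases} x_{ij} & 2 \le i,j \le n; \\ 1 - \sum_{k=2}^{n} x_{ik} & 2 \le i \le n,\ j = 1; \\ 1 - \sum_{k=2}^{n} x_{kj} & 2 \le j \le n,\ i = 1; \\ 1 - \sum_{l=2}^{n}\big(1 - \sum_{k=2}^{n} x_{kl}\big) & i = j = 1. \end{cases}$$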

Thus $\Gamma$ extends a matrix $X$ of size $(n-1) \times (n-1)$ to a matrix of size $n \times n$ with all row and column sums equal to one whose bottom right corner is $X$. With the above notation, the doubly stochastic matrices correspond to $(n-1) \times (n-1)$ matrices of the set

$$S_n := \left\{X = (x_{ij})_{2 \le i,j \le n} \in [0,1]^{(n-1)^2} : 0 \le \Gamma(X)_{ij} \le 1\right\}.$$
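The extension map and the membership test for $S_n$ can be written down directly. The sketch below assumes NumPy; the function names `extend_to_doubly_stochastic` and `in_S_n` are hypothetical and only mirror the definitions above.

```python
import numpy as np

def extend_to_doubly_stochastic(X_sub):
    """Apply the map Gamma to an (n-1) x (n-1) array X_sub = (x_ij), 2 <= i, j <= n."""
    m = X_sub.shape[0]
    full = np.empty((m + 1, m + 1))
    full[1:, 1:] = X_sub
    full[1:, 0] = 1.0 - X_sub.sum(axis=1)   # first column, rows 2..n
    full[0, 1:] = 1.0 - X_sub.sum(axis=0)   # first row, columns 2..n
    full[0, 0] = 1.0 - full[0, 1:].sum()    # corner entry
    return full

def in_S_n(X_sub):
    """True when every entry of Gamma(X_sub) lies in [0, 1]."""
    full = extend_to_doubly_stochastic(X_sub)
    return bool(np.all(full >= 0.0) and np.all(full <= 1.0))

print(in_S_n(np.full((3, 3), 0.25)))   # extends to the 4 x 4 matrix with all entries 1/4
print(in_S_n(np.full((3, 3), 0.40)))   # row sums exceed one, so the first column is negative
```

The rows and columns of the extended matrix always sum to one; membership in $S_n$ is exactly the requirement that the new first row and first column are non-negative.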

The distribution of $X$ as a random doubly stochastic matrix is then given by the uniform distribution on $S_n$. We next introduce an asymptotic formula by Canfield and McKay for the volume of $S_n$,

$$\operatorname{Vol}(S_n) = \frac{1}{(2\pi)^{n-1/2}\,n^{(n-1)^2}}\exp\left(\frac{1}{3} + n^2 + o(1)\right). \tag{1}$$

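This formula plays a crucial role in the transference principles to be introduced next. Define

$$D_n := \left\{Y = (y_{ij})_{1 \le i,j \le n} : \tfrac{1}{n}\Phi(Y) \in S_n,\ \min_{i,j}\left\{\tfrac{1}{n}y_{ij} - \Gamma(\tfrac{1}{n}\Phi(Y))_{ij}\right\} \ge 0\right\},$$

where $\Phi : \mathbb{R}^{n^2} \to \mathbb{R}^{(n-1)^2}$ is the projection $X = (x_{ij})_{1 \le i,j \le n} \mapsto (x_{ij})_{2 \le i,j \le n}$.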

Let $Y = (y_{ij})_{1 \le i,j \le n}$ be a random matrix where $y_{ij}$ are i.i.d. copies of a random exponential variable with mean one. As an application of (1), it is not hard to deduce the following transference principle between random doubly stochastic matrices $X$ and random i.i.d. matrices $Y$.

Lemma 2.2. [8, Lemma 2.1] Conditioned on $Y \in D_n$, the matrix $(\frac{1}{n}y_{ij})_{2 \le i,j \le n}$ is uniform on $S_n$. Furthermore, for large $n$ we have

$$P(Y \in D_n) \ge n^{-4n}.$$



Lemma 2.2 is useful when we want to pass an extremely rare event from the model $\frac{1}{n}Y$ to the model $X$. In applications (in particular when working with concentration results), it is more useful to work with matrices of bounded entries. With this goal in mind we define

$$\tilde{S}_n := \left\{X = (x_{ij})_{2 \le i,j \le n} \in [0,1]^{(n-1)^2} : 0 \le \Gamma(X)_{ij} \le \frac{10\log n}{n}\right\}$$
and

$$\tilde{D}_n := \left\{Y = (y_{ij})_{1 \le i,j \le n} \in [0, 10\log n]^{n^2} : \tfrac{1}{n}\Phi(Y) \in \tilde{S}_n,\ 0 \le \tfrac{1}{n}y_{ij} - \Gamma(\tfrac{1}{n}\Phi(Y))_{ij} \le n^{-4}\right\}.$$

Observe that $\tilde{S}_n$ corresponds to doubly stochastic matrices $X$ of entries bounded by $10\log n/n$.

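Let $\tilde{Y} = (\tilde{y}_{ij})_{1 \le i,j \le n}$ where $\tilde{y}_{ij}$ are i.i.d. copies of a truncated exponential $\tilde{y}$ with the following density function

$$p_{\tilde{y}}(x) = \begin{cases} \exp(-x)/(1 - n^{-10}) & \text{if } x \in [0, 10\log n], \\ 0 & \text{otherwise.} \end{cases} \tag{2}$$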
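For experiments with the bounded model, the truncated exponential in (2) can be sampled by inverting its distribution function $F(x) = (1 - e^{-x})/(1 - n^{-10})$. The sketch below assumes NumPy; both function names are hypothetical. The second function evaluates the second moment in closed form, using $\int_0^T x^2 e^{-x}\,dx = 2 - e^{-T}(T^2 + 2T + 2)$ with $T = 10\log n$.

```python
import math
import numpy as np

def sample_truncated_exponential(n, size, rng):
    """Inverse-CDF sampling from the density exp(-x) / (1 - n**-10) on [0, 10 log n]."""
    u = rng.random(size)
    return -np.log1p(-u * (1.0 - n ** -10.0))

def second_moment(n):
    """Second moment of the truncated exponential, from the closed-form integral."""
    T = 10.0 * math.log(n)
    return (2.0 - math.exp(-T) * (T * T + 2.0 * T + 2.0)) / (1.0 - n ** -10.0)

rng = np.random.default_rng(0)
draws = sample_truncated_exponential(10, 5, rng)
print(draws, second_moment(10))
```

The truncation removes only a mass of $n^{-10}$, so the moments are essentially those of the standard exponential.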



It is clear that $\mathbb{E}(\tilde{y}^2) = \Theta(1)$ and $\mathbb{E}(\tilde{y}^4) = \Theta(1)$. We now introduce another transference principle, which is an analogue of Lemma 2.2.

Lemma 2.3. [8, Lemma 4.1] Conditioned on $\tilde{Y} \in \tilde{D}_n$, the matrix $(\frac{1}{n}\tilde{y}_{ij})_{2 \le i,j \le n}$ is uniform on $\tilde{S}_n$. Furthermore, for large $n$ we have

$$P(\tilde{Y} \in \tilde{D}_n) \ge n^{-10n}.$$

Notice that in the corresponding definition of $\tilde{D}_n$ in [8, Section 4] the bound $10\log n$ was replaced by $6\log n$, but one can easily check that this modification does not affect the validity of Lemma 2.3.



2.4. Relation to random stochastic matrices. Let $\mathcal{R} = \mathcal{R}_{r,n}$ denote the $r(n-1)$-dimensional polytope of nonnegative matrices of size $r \times n$ whose rows sum to 1. Let $\mu_r$ denote the uniform probability measure on $\mathcal{R}$ and let $\nu_r$ denote the measure on $\mathcal{R}$ induced by the first $r$ rows of a random doubly stochastic matrix $X$. As another application of (1) (to be more precise, we need a more general form for the volume of polytopes generated by rectangular matrices of constant row and column sums), one can show that these two measures are comparable as long as $r$ is small.

Lemma 2.5. [8, Lemma 3.3] For a fixed integer $r \ge 1$ and $n > r$ the Radon-Nikodym derivative of the measures $\mu_r$ and $\nu_r$ satisfies

$$\frac{d\nu_r}{d\mu_r} \le (1 + o(1))\exp(r/2)$$

as $n \to \infty$.

It then follows that, in terms of order, there is not much difference between the models X 
and X. 
$$P_X\left(n^{-B} \le n x_{11} \le B\log n\right) \ge 1 - O(n^{-B/2}).$$



In particular, since the entries of $X$ are exchangeable, Theorem 2.6 yields the following.

Corollary 2.7. Assume that $X$ is a random doubly stochastic matrix, then

$$P(X \in \tilde{S}_n) = P(|x_{ij}| \le 10\log n/n \text{ for all } 1 \le i,j \le n) \ge 1 - O(n^{-3}).$$



Proof (of Theorem 2.6). It follows from Lemma 2.5 (for $r = 1$) that

$$P(n^{-B} \le n x_{11} \le B\log n) \le (1 + o(1))\exp(1/2)\,P(n^{-B} \le n x_1 \le B\log n),$$

where $x_1$ has distribution $B(1, n-1)$. The claim then follows because

$$\begin{aligned} P(n^{-B} \le n x_1 \le B\log n) &= (n-1)\int_{n^{-B-1}}^{B\log n/n}(1-x)^{n-2}\,dx \\ &= 1 - (n-1)\left(\int_0^{n^{-B-1}}(1-x)^{n-2}\,dx + \int_{B\log n/n}^{1}(1-x)^{n-2}\,dx\right) \\ &\ge 1 - O(n^{-B/2}). \end{aligned}$$

□
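The Beta$(1, n-1)$ computation in this proof has a closed form, since $P(x_1 \le t) = 1 - (1-t)^{n-1}$. As a sanity check under that assumption only (the function name is hypothetical, and nothing here accounts for the comparison factor from Lemma 2.5), the following sketch evaluates the probability that $n x_1$ falls outside the window $[n^{-B}, B\log n]$ and rescales it by $n^{B}$.

```python
import math

def beta_window_probability(n, B):
    """P(n**-B <= n*x <= B*log(n)) for x ~ Beta(1, n-1), plus the complementary probability."""
    lower = n ** (-B - 1.0)
    upper = B * math.log(n) / n
    if not 0.0 < lower < upper < 1.0:
        raise ValueError("window must lie strictly inside (0, 1)")
    # P(x <= t) = 1 - (1 - t)**(n - 1); log1p/expm1 keep the tiny tails accurate.
    below = -math.expm1((n - 1) * math.log1p(-lower))
    above = math.exp((n - 1) * math.log1p(-upper))
    return 1.0 - below - above, below + above

for n in (100, 1000, 10000):
    prob, failure = beta_window_probability(n, B=2.0)
    print(n, failure, failure * n ** 2.0)
```

The rescaled failure probability in the last column stays bounded as $n$ grows, so for the Beta marginal the two tails are each of order $n^{-B}$.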



We end this section by proving the boundedness condition of Lemma 1.8, that is, Theorem 1.9.



2.8. A proof for Theorem 1.9. We first focus on the random vector $x = (x_1, \dots, x_n)$ chosen uniformly from the simplex $S = \{x = (x_1, \dots, x_n) : 0 \le x_i \le 1,\ \sum_i x_i = 1\}$. Because each $x_i$ has distribution $B(1, n-1)$, we have

$$\mathbb{E}_x\|x\|^2 = \frac{2}{n+1}. \tag{3}$$

Also, it can be shown that (for instance from [22, equation (19)])

$$\mathbb{E}_x\,x_1x_2 = \frac{1}{n(n+1)}. \tag{4}$$
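The two moment identities (3) and (4) are easy to confirm by simulation. The sketch below assumes NumPy and uses the standard fact that a vector of i.i.d. exponentials divided by its sum is uniform on the simplex; the sample size, the seed and the function name are arbitrary choices made here.

```python
import numpy as np

def sample_simplex(n, num_samples, rng):
    """Uniform points on the simplex, drawn as normalised i.i.d. exponentials."""
    e = rng.exponential(size=(num_samples, n))
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(2)
n, num_samples = 50, 20000
x = sample_simplex(n, num_samples, rng)

norm_sq_estimate = (x ** 2).sum(axis=1).mean()   # Monte Carlo estimate of E||x||^2
cross_estimate = (x[:, 0] * x[:, 1]).mean()      # Monte Carlo estimate of E x_1 x_2
print(norm_sq_estimate, 2 / (n + 1))
print(cross_estimate, 1 / (n * (n + 1)))
```

Each printed pair should agree up to Monte Carlo error.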
It thus follows from (3) that $\|x\| = O(1/\sqrt{n})$ with high probability. It turns out that this probability is extremely close to one.

Lemma 2.9. Assume that $x$ is sampled uniformly from $S$ and assume that $\epsilon > 0$ is a sufficiently small constant. Then there exists a positive constant $C > 0$ such that

$$P(\|x\| \ge C/\sqrt{n}) \le \exp(-\epsilon\sqrt{n}).$$

We assume Lemma 2.9 for the moment.



Proof (of Theorem 1.9). First, it follows from Lemma 2.5 (for $r = 1$) that

$$\begin{aligned} P(x_{21}^2 + \dots + x_{n1}^2 \ge C/n) &\le (1 + o(1))\exp(1/2)\,P(x_2^2 + \dots + x_n^2 \ge C/n) \\ &= O(1)\,P(x_1^2 + x_2^2 + \dots + x_n^2 \ge C/n), \end{aligned}$$

where $(x_1, x_2, \dots, x_n)$ are sampled uniformly from the simplex $S$. But Lemma 2.9 indicates that the RHS is bounded by $\exp(-\epsilon\sqrt{n})$. Thus

$$P(x_{21}^2 + \dots + x_{n1}^2 \ge C/n) = O(\exp(-\epsilon\sqrt{n})). \tag{5}$$

And so, as the $x_{ij}$ are exchangeable, for any $j$ we also have

$$P(x_{2j}^2 + \dots + x_{nj}^2 \ge C/n) = O(\exp(-\epsilon\sqrt{n})). \tag{6}$$

The claim of Theorem 1.9 then follows because $\sum_{2 \le i,j \le n}(x_{ij} - x_{i1})^2 \ge C$ would imply $\sum_{i=2}^{n}x_{ij}^2 \ge C/4n$ for some $j$.

□



It remains to prove Lemma 2.9. We apply the following concentration result by Paouris.



Theorem 2.10. [27, Theorem 1.1] There exists an absolute constant $c > 0$ such that if $K$ is an isotropic convex body in $\mathbb{R}^n$, then

$$P(x \in K,\ \|x\| \ge c\sqrt{n}L_K t) \le \exp(-\sqrt{n}\,t)$$

for every $t \ge 1$, where $L_K$ is the isotropic constant of $K$.



Observe that, by the triangle inequality, for Lemma 2.9 it is enough to give a similar probability bound for the event $\|x - (1/n, \dots, 1/n)\| \ge C/\sqrt{n}$.
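We first shift $S$ to the hyperplane $H := \{x' = (x'_1, \dots, x'_n) : x'_1 + \dots + x'_n = 0\}$ by the translation $x = (x_1, \dots, x_n) \mapsto (x_1 - 1/n, \dots, x_n - 1/n)$. We then scale the obtained body by a factor $a = \Theta(n)$ to obtain a regular simplex $S'$ of volume one. Elementary computations show that this is an isotropic body of bounded isotropic constant. Indeed, if $x' = (x'_1, \dots, x'_n)$ is sampled uniformly from $S'$ and if $\theta = (\theta_1, \dots, \theta_n)$ is any unit vector in $H$, then by (3) and (4)

$$\begin{aligned} \mathbb{E}_{x' \in S'}\langle x', \theta\rangle^2 &= \mathbb{E}_{x \in S}\,a^2\Big(\sum_i \theta_i\big(x_i - \tfrac{1}{n}\big)\Big)^2 \\ &= a^2\,\mathbb{E}_{x \in S}\Big[\sum_i \theta_i^2\big(x_i - \tfrac{1}{n}\big)^2 + 2\sum_{i<j}\theta_i\theta_j\big(x_i - \tfrac{1}{n}\big)\big(x_j - \tfrac{1}{n}\big)\Big] \\ &= a^2\Big[\Big(\frac{2}{n(n+1)} - \frac{1}{n^2}\Big)\sum_i \theta_i^2 + \Big(\frac{1}{n(n+1)} - \frac{1}{n^2}\Big)\sum_{i \ne j}\theta_i\theta_j\Big] \\ &= a^2\Big[\frac{1}{n(n+1)}\sum_i \theta_i^2 + \Big(\frac{1}{n(n+1)} - \frac{1}{n^2}\Big)\Big(\sum_i \theta_i\Big)^2\Big] \\ &= \frac{a^2}{n(n+1)}. \end{aligned}$$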

Thus the isotropic constant of $S'$ is of constant order. Theorem 2.10 applied to $x'$ yields the following for a sufficiently large constant $C$,

$$P(x' \in S',\ \|x'\| \ge C\sqrt{n}) \le \exp(-\epsilon\sqrt{n}).$$

Lemma 2.9 then follows because $a\|x - (1/n, \dots, 1/n)\| = \|x'\|$.



3. The singularity of $\bar{X}$

In order to justify Theorem 1.10, one of the key steps is to bound the singularity probability of the matrix $\sqrt{n}\bar{X} - z_0 I_{n-1}$. This problem is of interest in its own right.

We will show the following general result regarding the least singular value $\sigma_{n-1}$.

Theorem 3.1. Let $F = (f_{ij})_{2 \le i,j \le n}$ be a deterministic matrix where $|f_{ij}| \le n^{\gamma}$ with some positive constant $\gamma$. Let $X$ be an $n \times n$ matrix chosen uniformly from the set of doubly stochastic matrices. Then for any positive constant $B$ there exists a positive constant $A$ such that

$$P(\sigma_{n-1}(\bar{X} + F) \le n^{-A}) \le n^{-B}.$$



Combined with Corollary 2.7, this yields the following important corollary, which will be reserved for later applications.

Corollary 3.2. Let $F = (f_{ij})_{2 \le i,j \le n}$ be a deterministic matrix where $|f_{ij}| \le n^{\gamma}$ with some positive constant $\gamma$. Let $\tilde{X} = (\tilde{x}_{ij})$ be a random doubly stochastic matrix where $\tilde{x}_{ij} \le 10\log n/n$ for all $1 \le i,j \le n$. Then there exists a positive constant $A$ such that

$$P(\sigma_{n-1}(\bar{\tilde{X}} + F) \le n^{-A}) = O(n^{-3}).$$

Here $\bar{\tilde{X}}$ is obtained from $\tilde{X}$ in the same way as $\bar{X}$ was defined from $X$.



We remark that a similar version of Theorem 3.1 had appeared in [34] to deal with random matrices of i.i.d. entries (see also [6], [25] and the references therein). However, our task here looks much harder as the entries in each row and each column are not independent. We will now sketch the proof of Theorem 3.1; more details will be presented in Section 4.



Assume that $\sigma_{n-1}(\bar{X} + F) \le n^{-A}$. Then, by letting $C = (c_{ij})_{2 \le i,j \le n}$ be the cofactor matrix of $\bar{X} + F$, there exist vectors $x$ and $y$ such that $\|x\| = 1$ and $\|y\| \le n^{-A}$ and

$$Cy = \det(\bar{X} + F)\,x.$$

So

$$\|Cy\| = |\det(\bar{X} + F)|.$$

Thus by the Cauchy-Schwarz inequality, with a loss of a factor of $n$ in probability and without loss of generality we can assume that

$$\sum_{j=2}^{n}|c_{2j}|^2 \ge n^{2A-1}|\det(\bar{X} + F)|^2. \tag{7}$$

In what follows we fix the matrix $X_{(n-2)\times(n-1)}$ generated by the last $(n-2)$ rows and the last $(n-1)$ columns of $X$ (equivalently, we fix the last $(n-2)$ rows of $X$).



Let $s_2, \dots, s_n$ be the column sums of $X_{(n-2)\times(n-1)}$. By Theorem 2.6 the probability that all $x_{11}, \dots, x_{1n}, x_{21}, \dots, x_{2n}$ are greater than $n^{-2B-2}$ is bounded from below by $1 - O(n^{-B})$, in which case we have

$$s_i \le 1 - n^{-2B-2} \text{ for all } i \ge 2, \text{ and } 0 \le s_1 := (n-2) - (s_2 + \dots + s_n) \le 1 - n^{-2B-2}. \tag{8}$$

Thus it is enough to justify Theorem 3.1 conditioning on this event.



Next, given a sequence $s_2, \dots, s_n$ satisfying (8), we will choose $x_2 := x_{22}, \dots, x_n := x_{2n}$ uniformly and respectively from the intervals $[0, 1 - s_2], \dots, [0, 1 - s_n]$ such that

$$s_1 \le x_2 + \dots + x_n \le 1. \tag{9}$$

The upper bound guarantees that $x_1 := x_{21} = 1 - (x_2 + \dots + x_n) \ge 0$, while the lower bound ensures that $x_{11} = 1 - s_1 - x_{21} = x_2 + \dots + x_n - s_1 \ge 0$.



We now express $\det(\bar{X} + F)$ as a linear form of its first row $(x_2 - x_1 + f_{22}, \dots, x_n - x_1 + f_{2n})$,

$$\det(\bar{X} + F) = \sum_{2 \le j \le n} c_{2j}(\bar{X} + F)(x_j - x_1 + f_{2j}).$$

By using the fact that $x_1 = 1 - \sum_{2 \le j \le n}x_j$ we can rewrite the above as

$$\det(\bar{X} + F) = \sum_{2 \le j \le n}\Big(c_{2j} + \sum_{2 \le i \le n}c_{2i}\Big)x_j + c, \tag{10}$$

where $c$ is a constant depending on the $c_{2j}$'s and $f_{2j}$'s.
Observe that

$$\sum_{2 \le j \le n}\Big|c_{2j} + \sum_{2 \le i \le n}c_{2i}\Big|^2 = \sum_{2 \le j \le n}|c_{2j}|^2 + (n+1)\Big|\sum_{2 \le j \le n}c_{2j}\Big|^2 \ge \sum_{2 \le j \le n}|c_{2j}|^2.$$

Thus, by increasing $A$ if needed, we obtain from (7) and (10) the following

$$\Big|\sum_{2 \le j \le n}a_jx_j + c\Big| \le n^{-A},$$

where

$$a_j := \frac{c_{2j} + \sum_{2 \le i \le n}c_{2i}}{\big(\sum_{2 \le j \le n}|c_{2j} + \sum_{2 \le i \le n}c_{2i}|^2\big)^{1/2}}. \tag{11}$$

Roughly speaking, our approach to prove Theorem 3.1 consists of two main steps.

• Inverse step. Given the matrix $X_{(n-2)\times(n-1)}$ for which all the column sums $s_i$ satisfy (8), assume that

$$P_{x_2,\dots,x_n}\Big(\Big|\sum_{2 \le j \le n}a_jx_j + c\Big| \le n^{-A}\Big) \ge n^{-B},$$

where the probability is taken over all $x_i$, $2 \le i \le n$, which satisfy (9). Then there is a strong structure among the cofactors $c_{2j}$ of $X_{(n-2)\times(n-1)}$.

• Counting step. With respect to $X_{(n-2)\times(n-1)}$, the probability that there is a strong structure among the cofactors $c_{2j}$ is negligible.
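The linear form (10) and the norm identity used to define the normalized coefficients in (11) can be verified numerically. The sketch below assumes NumPy; the doubly stochastic matrix is a random convex combination of permutation matrices (not the uniform measure), the shift $F$ is a Gaussian stand-in for the deterministic matrix, and all names are hypothetical.

```python
import numpy as np

def random_doubly_stochastic(n, num_perms, rng):
    """Convex combination of random permutation matrices (NOT uniform on the polytope)."""
    weights = rng.dirichlet(np.ones(num_perms))
    X = np.zeros((n, n))
    for w in weights:
        X[np.arange(n), rng.permutation(n)] += w
    return X

def first_row_cofactors(M):
    """Cofactors of the first row of M; they depend only on the other rows."""
    rest = M[1:, :]
    return np.array([(-1) ** j * np.linalg.det(np.delete(rest, j, axis=1))
                     for j in range(M.shape[0])])

rng = np.random.default_rng(4)
n = 6
X = random_doubly_stochastic(n, num_perms=40, rng=rng)
F = rng.normal(size=(n - 1, n - 1))      # stand-in for the deterministic shift F
M = X[1:, 1:] - X[1:, [0]] + F           # the matrix X_bar + F

cof = first_row_cofactors(M)             # c_{2j}, j = 2..n
S = cof.sum()
x = X[1, 1:]                             # x_j = x_{2j}, j = 2..n
c = cof @ F[0] - S                       # constant term of the linear form (10)

linear_form = (cof + S) @ x + c
norm_identity_gap = (np.sum(np.abs(cof + S) ** 2)
                     - (np.sum(np.abs(cof) ** 2) + (n + 1) * abs(S) ** 2))
print(np.linalg.det(M), linear_form, norm_identity_gap)
```

The first two printed numbers should coincide and the third should vanish up to rounding, reflecting that the cofactors $c_{2j}$ are determined by the fixed rows alone.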

We pause to discuss the structure mentioned in the inverse step. A set $Q \subset \mathbb{C}$ is a GAP of rank $r$ if it can be expressed in the form

$$Q = \{g_0 + k_1g_1 + \dots + k_rg_r \mid k_i \in \mathbb{Z},\ K_i \le k_i \le K'_i \text{ for all } 1 \le i \le r\}$$

for some $(g_0, \dots, g_r) \in \mathbb{C}^{r+1}$ and $(K_1, \dots, K_r), (K'_1, \dots, K'_r) \in \mathbb{Z}^r$.

It is convenient to think of $Q$ as the image of an integer box $B := \{(k_1, \dots, k_r) \in \mathbb{Z}^r \mid K_i \le k_i \le K'_i\}$ under the linear map $\Phi : (k_1, \dots, k_r) \mapsto g_0 + k_1g_1 + \dots + k_rg_r$.

The numbers $g_i$ are the generators of $Q$, the numbers $K'_i$ and $K_i$ are the dimensions of $Q$, and $\operatorname{Vol}(Q) := |B|$ is the size of $B$. We say that $Q$ is proper if this map is one to one, or equivalently if $|Q| = \operatorname{Vol}(Q)$. For non-proper GAPs, we of course have $|Q| < \operatorname{Vol}(Q)$. If $-K_i = K'_i$ for all $i \ge 1$ and $g_0 = 0$, we say that $Q$ is symmetric.
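To make the definition concrete, the following small sketch enumerates a GAP as the image of its integer box and tests properness by comparing $|Q|$ with $\operatorname{Vol}(Q)$. The function names are hypothetical; the examples use integer and Gaussian-integer generators so that the arithmetic is exact (for general generators one would want exact rationals rather than floating point).

```python
from itertools import product

def gap_elements(g0, generators, lower, upper):
    """List g0 + sum(k_i * g_i) over the integer box lower[i] <= k_i <= upper[i]."""
    ranges = [range(lo, hi + 1) for lo, hi in zip(lower, upper)]
    return [g0 + sum(k * g for k, g in zip(ks, generators)) for ks in product(*ranges)]

def volume(lower, upper):
    """Number of integer points in the box, i.e. Vol(Q)."""
    vol = 1
    for lo, hi in zip(lower, upper):
        vol *= hi - lo + 1
    return vol

def is_proper(g0, generators, lower, upper):
    """A GAP is proper when distinct box points give distinct elements."""
    elements = gap_elements(g0, generators, lower, upper)
    return len(set(elements)) == volume(lower, upper)

# Rank-2 symmetric GAP in C with generators 1 and i: a 3 x 3 grid of Gaussian integers.
print(is_proper(0, [1, 1j], [-1, -1], [1, 1]))
# Generators 1 and 2 collide (2 = 2*1 + 0*2 = 0*1 + 1*2), so this GAP is not proper.
print(is_proper(0, [1, 2], [0, 0], [2, 1]))
```

The second example has $\operatorname{Vol}(Q) = 6$ but only five distinct elements.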

We are now ready to state our steps in detail.

Theorem 3.3 (Inverse Step). Let $0 < \epsilon < 1$ and $B > 0$ be given constants. Assume that

$$P_{x_2,\dots,x_n}\Big(\Big|\sum_{2 \le j \le n}a_jx_j + c\Big| \le n^{-A}\Big) \ge n^{-B}$$

for some sufficiently large integer $A$, where $a_j$ are defined in (11), and $x_j$ are chosen uniformly from the intervals $[0, 1 - s_j]$ such that the constraint (9) holds. Then there exists a vector $u = (u_2, \dots, u_n) \in \mathbb{C}^{n-1}$ which satisfies the following properties.

• $1/2 \le \|u\| \le 2$ and $|\langle u, r_i(\bar{X} + F)\rangle| \le n^{-A+\gamma+2}$ for all but the first row of $\bar{X} + F$.

• All but $n^{\epsilon}$ components $u_i$ belong to a GAP $Q$ (not necessarily symmetric) of rank $r = O_{B,\epsilon}(1)$, and of cardinality $|Q| = n^{O_{B,\epsilon}(1)}$.

• All the real and imaginary parts of $u_i$ and of the generators of $Q$ are rational numbers of the form $p/q$, where $|p|, |q| \le n^{2A+3/2}$.


In the second step of the approach we show that the probability for $X_{(n-2)\times(n-1)}$ having the above properties is negligible.

Theorem 3.4 (Counting Step). With respect to $X_{(n-2)\times(n-1)}$, or equivalently, with respect to the last $(n-2)$ rows of $X$, the probability that there exists a vector $u$ as in Theorem 3.3 is $\exp(-\Omega(n))$.

Proof (of Theorem 3.4). Firstly, we show that the number of structural vectors $u$ described in Theorem 3.3 is bounded by $n^{O_{B,\epsilon}(n) + O_A(n^{\epsilon})}$. Indeed, because each GAP is determined by its generators and its dimensions, and because all the real and imaginary parts of the generators are of the form $p/q$ where $|p|, |q| \le n^{2A+3/2}$, there are $n^{O_{A,B,\epsilon}(1)}$ GAPs which have rank $O_{B,\epsilon}(1)$ and size $n^{O_{B,\epsilon}(1)}$. Next, for each determined GAP $Q$ of size $n^{O_{B,\epsilon}(1)}$, there are $|Q|^n = n^{O_{B,\epsilon}(n)}$ ways to choose the $u_i$ as its elements. For the remaining $O(n^{\epsilon})$ exceptional $u_i$ that may not belong to $Q$, there are $n^{O_A(n^{\epsilon})}$ ways to choose them as numbers of the form $p/q$ where $|p|, |q| \le n^{2A+3/2}$. Putting these together we obtain the bound as claimed.



Secondly, for each fixed structural vector $u$ from Theorem 3.3 we have $|\langle u, r_i(\bar{X} + F)\rangle| = O(n^{-A+\gamma+2})$ for all $2 \le i \le n-1$. So

$$\Big|\sum_{2 \le j}u_j(x_{ij} - x_{i1} + f_{ij})\Big| = \Big|\sum_{2 \le j}x_{ij}\Big(u_j + \sum_{2 \le k}u_k\Big) - \sum_{2 \le j}u_j + \sum_{2 \le j}u_jf_{ij}\Big| = O(n^{-A+\gamma+2}). \tag{12}$$

We next view this inequality for the matrix models $Y$ and $\bar{Y}$, where $Y$ was introduced in Section 2 and $\bar{Y}$ is obtained from $Y$ in the same way as $\bar{X}$ was defined from $X$,

$$\Big|\sum_{2 \le j}\frac{y_{ij}}{n}\Big(u_j + \sum_{2 \le k}u_k\Big) - \sum_{2 \le j}u_j + \sum_{2 \le j}u_jf_{ij}\Big| = O(n^{-A+\gamma+2}). \tag{13}$$



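Observe that

$$\sum_{2 \le j \le n}\Big|u_j + \sum_{2 \le k \le n}u_k\Big|^2 \ge \sum_{2 \le k \le n}|u_k|^2 \ge 1/4.$$

Thus there exists $j_0$ such that

$$\Big|u_{j_0} + \sum_{2 \le k \le n}u_k\Big| \ge \frac{1}{2\sqrt{n}}.$$

It then follows that for each $i$, with room to spare,

$$\begin{aligned} &P\Big(\Big|\sum_{2 \le j}\frac{y_{ij}}{n}\Big(u_j + \sum_{2 \le k}u_k\Big) - \sum_{2 \le j}u_j + \sum_{2 \le j}u_jf_{ij}\Big| = O(n^{-A+\gamma+2})\Big) \\ &\quad \le \sup_{y_{ij},\,j \ne j_0}P\Big(\Big|\frac{y_{ij_0}}{n}\Big(u_{j_0} + \sum_{2 \le k \le n}u_k\Big) + \sum_{j \ne j_0}\frac{y_{ij}}{n}\Big(u_j + \sum_{2 \le k \le n}u_k\Big) - \cdots\Big| = O(n^{-A+\gamma+2})\ \Big|\ y_{ij},\,j \ne j_0\Big) \\ &\quad = O(n^{-A+\gamma+10}), \end{aligned}$$

where in the last conditional probability estimate we used the fact that the $y_{ij}$ are i.i.d. exponentials of mean one.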



Hence, for each fixed structural vector $u$, the probability $P_u$ that (13) holds for all rows $r_i(\bar{Y} + F)$, $2 \le i \le n-1$, is bounded by

$$P_u \le n^{(-A+\gamma+10)(n-2)}.$$

Summing over structural vectors $u$, we thus obtain the following upper bound for the probability that there exists a structural vector $u$ for which (13) holds for all rows $r_i(\bar{Y} + F)$, $2 \le i \le n-1$,

$$\sum_u P_u \le n^{O_{B,\epsilon}(n) + O_A(n^{\epsilon})}\,n^{(-A+\gamma+10)(n-2)} = O(n^{-An/2}),$$

provided that $A$ is large enough.



To conclude the proof of Theorem 3.4 we use Lemma 2.2 to pass from $Y$ and $\bar{Y}$ back to $X$ and $\bar{X}$. The probability that there exists a structural vector $u$ for which (12) holds for all rows $r_i(\bar{X} + F)$, $2 \le i \le n-1$, is then bounded by $O(n^{-An/2+4n}) = O(\exp(-\Theta(n)))$, provided that $A$ is sufficiently large. □



4. Proof of Theorem 3.3



We recall from the assumption of Theorem 3.3 that

$$P_{x_2,\dots,x_n}\Big(\Big|\sum_{j \ge 2}a_jx_j + c\Big| \le n^{-A}\Big) \ge n^{-B}, \tag{14}$$

where $x_2, \dots, x_n$ are uniformly sampled from the intervals $[0, 1 - s_2], \dots, [0, 1 - s_n]$ respectively so that (9) holds.

This is a large concentration of a linear form of mildly dependent random variables. Our first goal is to relax these dependencies.



4.1. A simple reduction step. Let $E_n$ be the set of all $(x_2, \dots, x_n) \in [0, 1 - s_2] \times \dots \times [0, 1 - s_n]$ for which (9) holds. We recall from (8) that $s_i \le 1 - n^{-2B-2}$.

Consider the event $s_1 \le x'_2 + \dots + x'_n \le 1$, where the $x'_i$ are independently and uniformly sampled from the intervals $[0, 1 - s_i]$ respectively.

Note that $\mathbb{E}(x'_2 + \dots + x'_n) = \sum_{2 \le j \le n}(1 - s_j)/2 = (1 + s_1)/2$, which is the midpoint of the interval $[s_1, 1]$. Since the random variables $x'_j - (1 - s_j)/2$ are symmetric and uniform, the density function $f(x)$ of $x'_2 + \dots + x'_n$ is maximized at $(1 + s_1)/2$ and decreases as $|x - (1 + s_1)/2|$ increases. Thus we have
$$\begin{aligned} P((x'_2, \dots, x'_n) \in E_n) &= P(s_1 \le x'_2 + \dots + x'_n \le 1) \\ &= \int_{s_1}^{1}f(x)\,dx = \frac{\int_{s_1}^{1}f(x)\,dx}{\int_0^{(1-s_2)+\dots+(1-s_n)}f(x)\,dx} \\ &\ge \frac{1 - s_1}{(1 - s_2) + \dots + (1 - s_n)} = \frac{1 - s_1}{1 + s_1} \\ &= \Omega(n^{-2B-2}), \end{aligned}$$

where we noted from (8) that $s_1 \le 1 - n^{-2B-2}$.
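The lower bound $(1 - s_1)/(1 + s_1)$ can be compared with a direct simulation. The sketch below assumes NumPy; the widths $1 - s_j$ are hypothetical values chosen here for a toy case with three free coordinates, and $s_1$ is then forced by the relation $\sum_j(1 - s_j) = 1 + s_1$. The window $[s_1, 1]$ is symmetric about half the total width $\sum_j(1 - s_j)/2$, where the density of the sum peaks.

```python
import numpy as np

def window_probability(widths, s1, num_samples, rng):
    """Estimate P(s1 <= x_2' + ... + x_n' <= 1) for independent x_j' ~ U[0, widths[j]]."""
    samples = rng.random((num_samples, len(widths))) * np.asarray(widths)
    totals = samples.sum(axis=1)
    return float(np.mean((totals >= s1) & (totals <= 1.0)))

widths = [0.5, 0.4, 0.5]          # hypothetical values of 1 - s_j, j = 2, 3, 4
s1 = sum(widths) - 1.0            # forced by sum_j (1 - s_j) = 1 + s_1
rng = np.random.default_rng(3)
estimate = window_probability(widths, s1, 200_000, rng)
lower_bound = (1.0 - s1) / (1.0 + s1)
print(estimate, lower_bound)
```

In this toy case the estimate is well above the bound, as expected: the bound only uses that the window captures a central portion of a unimodal symmetric density.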



Observe that if we condition on $s_1 \le x'_2 + \dots + x'_n \le 1$, then the distribution of $(x'_2, \dots, x'_n)$ is uniform over the set $E_n$. It thus follows from (14) that

$$P_{x'_2,\dots,x'_n}\Big(\Big|\sum_{j \ge 2}a_jx'_j + c\Big| \le n^{-A}\Big) \ge n^{-3B-2}. \tag{15}$$



In the next step of the reduction, we divide the intervals $[0, 1 - s_i]$ into disjoint intervals $I_{i1}, \dots, I_{ik_i}$ of length $n^{-3B-2}$, where $k_i = (1 - s_i)/n^{-3B-2}$ (without loss of generality, we assume that the $k_i$ are integers). Next, to sample $x'_i$ uniformly from the interval $[0, 1 - s_i]$ we first choose at random an interval from $\{I_{i1}, \dots, I_{ik_i}\}$ and then sample $x'_i$ from it. In this way, (15) implies that there exist intervals $I_{ij_i}$, $2 \le i \le n$, such that if the $x'_i$ are chosen uniformly from $I_{ij_i}$ then

$$P_{x'_2,\dots,x'_n}\Big(\Big|\sum_{i \ge 2}a_ix'_i + c\Big| \le n^{-A}\Big) \ge n^{-3B-2}. \tag{16}$$

Observe furthermore that, by shifting $c$ if needed, we can assume that $I_{ij_i} = [0, n^{-3B-2}]$ for all $i$. Finally, by passing to $x''_i := n^{3B+2}x'_i$ and by decreasing $A$ to $A - (3B + 2)$, we can assume that all $x'_i$ are uniformly sampled from the interval $[0, 1]$.



4.2. High concentration of linear form. A classical result of Erdős [12] and Littlewood-Offord [21] asserts that if $b_i$ are real numbers of magnitude $|b_i| \ge 1$, then the probability that the random sum $\sum_{i=1}^{n}b_ix_i$ concentrates on an interval of length one is of order $O(n^{-1/2})$, where $x_i$ are i.i.d. copies of a Bernoulli random variable. This remarkable inequality has generated an impressive wave of research, particularly from the early 1960s to the late 1980s. We refer the reader to [18], [20] and the references therein for these developments.
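For integer coefficients the concentration probability can be computed exactly by dynamic programming over the possible values of the sum. The sketch below takes the Bernoulli variables to be independent uniform signs $\pm 1$ (an assumption made here) and measures concentration at a single point; the function name is hypothetical. With all $b_i = 1$ the maximum is the central binomial probability, which is the extremal case in the Erdős bound, while distinct coefficients give a much smaller value.

```python
from collections import Counter
from fractions import Fraction
from math import comb

def max_point_concentration(coefficients):
    """Exact max over a of P(sum b_i x_i = a) for independent uniform signs x_i = +-1."""
    distribution = Counter({0: 1})          # value of the partial sum -> number of sign choices
    for b in coefficients:
        updated = Counter()
        for value, count in distribution.items():
            updated[value + b] += count
            updated[value - b] += count
        distribution = updated
    return Fraction(max(distribution.values()), 2 ** len(coefficients))

n = 20
print(max_point_concentration([1] * n))                 # all coefficients equal
print(max_point_concentration(list(range(1, n + 1))))   # distinct coefficients
```

The gap between the two outputs is a small instance of the phenomenon behind the inverse theorems discussed next: large concentration forces additive structure in the coefficients.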

Motivated by inverse theorems from additive combinatorics, Tao and Vu studied the underlying reason as to why the concentration probability of $\sum_{i=1}^{n}b_ix_i$ on a short interval is large. A closer look at the definition of GAPs in the previous section reveals that if the $b_i$ are very close to the elements of a GAP of rank $O(1)$ and size $n^{O(1)}$, then the concentration probability of $\sum_{i=1}^{n}b_ix_i$ on a short interval is of order $n^{-O(1)}$, where $x_i$ are i.i.d. copies of a Bernoulli random variable.

It has been shown by Tao and Vu [32], [35] in an implicit way, and by the current author and Vu [24] in a more explicit way, that these are essentially the only examples that have high concentration probability.

We say that a complex number $a$ is $\delta$-close to a set $Q \subset \mathbb{C}$ if there exists $q \in Q$ such that $|a - q| \le \delta$.

Theorem 4.3 (Inverse Littlewood-Offord theorem for linear forms). [24, Corollary 2.10] Let $0 < \epsilon < 1$ and $C > 0$. Let $\beta > 0$ be an arbitrary real number that may depend on $n$. Suppose that $b_i = (b_{i1}, b_{i2})$ are complex numbers such that $\sum_{i=1}^{n}\|b_i\|^2 = 1$, and

$$\sup_a P_x\Big(\Big|\sum_{i=1}^{n}b_ix_i - a\Big| \le \beta\Big) = \rho \ge n^{-C},$$

where $x = (x_1, \dots, x_n)$, and $x_i$ are i.i.d. copies of a random variable $\xi$ satisfying $P(c_1 \le \xi - \xi' \le c_2) \ge c_3$ for some positive constants $c_1$, $c_2$ and $c_3$. Then, for any number $n'$ between $n^{\epsilon}$ and $n$, there exists a proper symmetric GAP $Q = \{\sum_{i=1}^{r}k_ig_i : k_i \in \mathbb{Z}, |k_i| \le L_i\}$ such that

• at least $n - n'$ numbers $b_i$ are $\beta$-close to $Q$;

• $Q$ has small rank, $r = O_{C,\epsilon}(1)$, and small cardinality

$$|Q| \le \max\Big(O_{C,\epsilon}\Big(\frac{\rho^{-1}}{\sqrt{n'}}\Big), 1\Big);$$

• there exists a non-zero integer $p = O_{C,\epsilon}(\sqrt{n'})$ such that all generators $g_i = (g_{i1}, g_{i2})$ of $Q$ have the form $g_{ij} = \beta\frac{p_{ij}}{p}$, with $p_{ij} \in \mathbb{Z}$ and $|p_{ij}| = O_{C,\epsilon}(\beta^{-1}\sqrt{n'})$.

Theorem 4.3 was proved in [24] with $c_1 = 1$, $c_2 = 2$ and $c_3 = 1/2$, but the proof there automatically extends to any constants $0 < c_1 < c_2$ and $0 < c_3$.



The interested reader is invited to read also [23], [28], [39] for other variants and further developments of the inverse results.



We now prove Theorem 3.3. Theorem 4.3 applied to (16), with $n' = n^{\epsilon}$, $C = 3B + 2$ and $x_i$ being independently and uniformly distributed over the interval $[0, 1]$, implies that there exists a vector $v = (v_2, \dots, v_n)$ such that

• $|a_i - v_i| \le n^{-A}$ for all indices $i$ from $\{2, \dots, n\}$;

• all but $n'$ numbers $v_i$ belong to a GAP $Q$ of small rank, $r = O_{B,\epsilon}(1)$, and of small cardinality $|Q| = O(n^{O_{B,\epsilon}(1)})$;

• all the real and imaginary parts of $v_i$ and of the generators of $Q$ are rational numbers of the form $p/q$, with $p, q \in \mathbb{Z}$ and $|p|, |q| = O_{B,\epsilon}(n^{A+1/2})$.



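Recall that

\[ a_j = \frac{c_{2j} + \sum_{2 \le i \le n} c_{2i}}{\big( \sum_{2 \le k \le n} |c_{2k} + \sum_{2 \le i \le n} c_{2i}|^2 \big)^{1/2}}. \]

We will translate the above useful information on the $a_j$'s to the $c_{2j}$'s. To do so we first find a number
of the form $p/n^A$, where $p \in \mathbb{Z}$ and $-n^A \le p \le n^A$, such that

\[ \Big| \frac{p}{n^A} - \frac{\sum_{2 \le i \le n} c_{2i}}{\big( \sum_{2 \le k \le n} |c_{2k} + \sum_{2 \le i \le n} c_{2i}|^2 \big)^{1/2}} \Big| \le \frac{1}{n^A}. \]

Thus, by shifting the GAP $Q$ by $p/n^A$, we obtain $|a'_j - v'_j| \le 2n^{-A}$, and so

\[ \|a' - v'\| = O(n^{-A+1/2}), \]

where $a' = (a'_2, \dots, a'_n)$, $v' = (v'_2, \dots, v'_n)$ and

\[ a'_j = \frac{c_{2j}}{\big( \sum_{2 \le k \le n} |c_{2k} + \sum_{2 \le i \le n} c_{2i}|^2 \big)^{1/2}} \quad \text{as well as} \quad v'_j = v_j - \frac{p}{n^A}. \]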

By definition, $1/2n^2 \le \sum_j |a'_j|^2 \le 1$, so by the triangle inequality

\[ \|v'\| \ge \|a'\| - O(n^{-A+1/2}) \ge \frac{1}{\sqrt{2}\,n} - O(n^{-A+1/2}) \]

and

\[ \|v'\| \le \|a'\| + O(n^{-A+1/2}) \le 1 + O(n^{-A+1/2}). \]

More importantly, as $a'$ is proportional to $(c_{22}, \dots, c_{2n})$ (which are the cofactors of $X + F$),
$a'$ is orthogonal to all but the first row of $X + F$. In other words, $|\langle a', r_i(X + F) \rangle| = 0$ for
all $i \ge 2$. It is thus implied that

\[ |\langle v', r_i(X + F) \rangle| \le n^{-A+1}. \]

In the last step of the proof, we find nonzero numbers $p', q' \in \mathbb{Z}$, $|p'|, |q'| = O(n)$, so that

\[ \|v'\|/2 \le p'/q' \le 2\|v'\|. \]

Set $u := (q'/p')\, v'$; we then have

• $1/2 \le \|u\| \le 2$ and $|\langle u, r_i(X + F) \rangle| \le n^{-A+2}$ for all but the first row of $X + F$;

• all but $n'$ components $u_i$ belong to a GAP $Q'$ (not necessarily symmetric) of small
rank, $r = O_{B,\epsilon}(1)$, and of small cardinality $|Q'| = O(n^{O_{B,\epsilon}(1)})$;

• all the real and imaginary parts of $u_i$ and of the generators of $Q'$ are rational numbers
of the form $p/q$, with $p, q \in \mathbb{Z}$ and $|p|, |q| = O_{B,\epsilon}(n^{2A+3/2})$.



5. Spectral concentration of i.i.d. random covariance matrices 



From now on we will mainly focus on the bounded model X rather than on X. This is the
model where we can relate to Y, a matrix of bounded i.i.d. entries (defined in Section 2), for
which concentration results may easily apply. Furthermore, by Corollary 2.7 there is not
much difference between the two models X and X.



Having learned from Corollary 3.2 that $|\det(\sqrt{n}X - z_0 I_{n-1})|$ is bounded away from zero,
we will show that $\frac{1}{n}\log|\det(\sqrt{n}X - z_0 I_{n-1})|$ is well concentrated around its mean. This
result will then immediately imply Theorem 1.10.



In order to study the concentration of $\det(\sqrt{n}X - z_0 I_{n-1})$, we might first relate it to the
counterpart Y. However, the entries of the latter model are not independent, and so certain
well-known concentration results for i.i.d. matrices are not applicable. To avoid this technical
issue, we will modify $\sqrt{n}X$ as follows. Observe that

\[ \det(\sqrt{n}X - z_0 I_{n-1}) = \frac{1}{\sqrt{n}} \det(\sqrt{n}X_{(n-1)\times n} - F_{z_0}), \tag{17} \]



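where $F_{z_0}$ is the deterministic matrix obtained from $z_0 I_{n-1}$ by attaching $(-\sqrt{n}, \dots, -\sqrt{n})$
and $(-\sqrt{n}, 0, \dots, 0)^T$ as its first row and first column respectively, and $X_{(n-1)\times n}$ is the
matrix obtained from X by replacing its first row by a zero vector,

\[ \sqrt{n}X_{(n-1)\times n} - F_{z_0} = \begin{pmatrix} \sqrt{n} & \sqrt{n} & \cdots & \sqrt{n} \\ \sqrt{n}x_{21} & \sqrt{n}x_{22} - z_0 & \cdots & \sqrt{n}x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \sqrt{n}x_{n1} & \sqrt{n}x_{n2} & \cdots & \sqrt{n}x_{nn} - z_0 \end{pmatrix}. \]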
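Identity (17) is a column operation: subtracting the first column of $\sqrt{n}X_{(n-1)\times n} - F_{z_0}$ from every other column leaves $(\sqrt{n}, 0, \dots, 0)$ in the first row. The following sketch checks this numerically. It assumes that the $(n-1)\times(n-1)$ matrix on the left of (17) has entries $x_{ij} - x_{i1}$, $2 \le i,j \le n$ (the analogue of the matrix used for Y in the proof of Theorem 5.15); the function name `reduced_and_bordered` is illustrative only.

```python
import numpy as np

def reduced_and_bordered(X, z0):
    """Return (sqrt(n)*Xred - z0*I_{n-1}, sqrt(n)*X_{(n-1)xn} - F_{z0})."""
    n = X.shape[0]
    s = np.sqrt(n)
    # Assumption: the reduced matrix has entries x_ij - x_i1 for 2 <= i, j <= n.
    Xred = X[1:, 1:] - X[1:, [0]]
    small = s * Xred - z0 * np.eye(n - 1)
    # F_{z0}: z0*I_{n-1} bordered by the row (-sqrt n, ..., -sqrt n)
    # and the column (-sqrt n, 0, ..., 0)^T.
    F = np.zeros((n, n), dtype=complex)
    F[0, :] = -s
    F[1:, 1:] = z0 * np.eye(n - 1)
    X0 = X.astype(complex)
    X0[0, :] = 0.0  # first row replaced by the zero vector
    big = s * X0 - F
    return small, big

rng = np.random.default_rng(0)
X = rng.random((6, 6))
small, big = reduced_and_bordered(X, 0.3 + 0.2j)
print(np.linalg.det(small), np.linalg.det(big) / np.sqrt(6))
```

The identity is purely algebraic, so the check works for any square matrix X, not only for doubly stochastic ones.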



As it turns out, it is more pleasant to work with $X_{(n-1)\times n}$ because the entries of its counterpart
$Y_{(n-1)\times n}$ are now independent. To relate the singularity of $\sqrt{n}X - z_0 I_{n-1}$ to that
of $\sqrt{n}X_{(n-1)\times n} - F_{z_0}$, we have a crucial observation below.

Claim 5.1. Suppose that $A$ is a sufficiently large constant. We have

\[ \sigma_n(\sqrt{n}X_{(n-1)\times n} - F_{z_0}) \ge \frac{1}{n} \min\Big( \frac{1}{\sqrt{2n}} \sigma_{n-1}(\sqrt{n}X - z_0 I_{n-1}) - O(n^{-A}),\ n^{-A} \Big). \]

To prove this claim, let $c_1, \dots, c_n$ be the columns of $\sqrt{n}X_{(n-1)\times n} - F_{z_0}$. Let $v = (v_1, \dots, v_n)$
be any unit vector. If $|v_1 + \dots + v_n| \ge n^{-A-1/2}$, then it is clear that
$\|(\sqrt{n}X_{(n-1)\times n} - F_{z_0})v\| \ge |\sqrt{n}(v_1 + \dots + v_n)| \ge n^{-A}$. Otherwise, as $|v_1|^2 + \dots + |v_n|^2 = 1$, we can easily deduce that
$|v_2|^2 + \dots + |v_n|^2 \ge 1/2n$. Next, by the triangle inequality,

\begin{align*}
\|(\sqrt{n}X_{(n-1)\times n} - F_{z_0})v\| &= \Big\| \sum_{1 \le i \le n} v_i c_i \Big\| = \Big\| \sum_{2 \le i \le n} v_i(c_i - c_1) + (v_1 + \dots + v_n)c_1 \Big\| \\
&\ge \Big\| \sum_{2 \le i \le n} v_i(c_i - c_1) \Big\| - n^{-A-1/2}\|c_1\| \\
&\ge (|v_2|^2 + \dots + |v_n|^2)^{1/2}\, \sigma_{n-1}(\sqrt{n}X - z_0 I_{n-1}) - n^{-A-1/2}\|c_1\| \\
&\ge \frac{1}{\sqrt{2n}} \sigma_{n-1}(\sqrt{n}X - z_0 I_{n-1}) - O(n^{-A}).
\end{align*}

Claim 5.1 guarantees that the polynomial probability bound for $\sigma_{n-1}(\sqrt{n}X - z_0 I_{n-1})$ from
Corollary 3.2 continues to hold for $\sigma_n(\sqrt{n}X_{(n-1)\times n} - F_{z_0})$ (with probably worse $A$).



Theorem 5.2. There exists a positive constant $A$ such that

\[ P(\sigma_n(\sqrt{n}X_{(n-1)\times n} - F_{z_0}) \le n^{-A}) = O(n^{-3}). \]

Our goal is then to establish a large concentration of $\frac{1}{n}\log|\det(\sqrt{n}X_{(n-1)\times n} - F_{z_0})|$ around
its mean. We now pass to consider Y.

5.3. Large concentration for Y. Consider the i.i.d. matrices Y defined in Section 2
and let $Y_{(n-1)\times n}$ be the matrix obtained from Y by replacing its first row by the zero vector.

We first observe from Claim 5.1 that



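\[ \sigma_n\Big(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}\Big) \ge \frac{1}{n} \min\Big( \frac{1}{\sqrt{2n}} \sigma_{n-1}\Big(\frac{1}{\sqrt{n}}Y - z_0 I_{n-1}\Big) - O(n^{-A}),\ n^{-A} \Big), \]

where

\[ \frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0} = \begin{pmatrix} \sqrt{n} & \sqrt{n} & \cdots & \sqrt{n} \\ \frac{1}{\sqrt{n}}y_{21} & \frac{1}{\sqrt{n}}y_{22} - z_0 & \cdots & \frac{1}{\sqrt{n}}y_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{1}{\sqrt{n}}y_{n1} & \frac{1}{\sqrt{n}}y_{n2} & \cdots & \frac{1}{\sqrt{n}}y_{nn} - z_0 \end{pmatrix}. \]

On the other hand, conditioning on $y_{21}, \dots, y_{n1}$, the entries $y_{ij} - y_{i1}$ of the matrix Y are
independent, and so we can apply known singularity bounds, for instance [3TJ Theorem
2.1], for i.i.d. matrices to conclude that for any positive constant $B$, there exists a positive
constant $A$ such that $P(\sigma_{n-1}(\frac{1}{\sqrt{n}}Y - z_0 I_{n-1}) \le n^{-A}) = O(n^{-B})$. Returning to $Y_{(n-1)\times n}$, we
hence obtain the following.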

Theorem 5.4. For any positive constant $B$, there exists a positive constant $A$ such that

\[ P\Big(\sigma_n\Big(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}\Big) \le n^{-A}\Big) = O(n^{-B}). \]

This bound will be exploited later on.

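Next, let $H$ denote the following Hermitian matrix

\[ H := \Big(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}\Big)\Big(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}\Big)^*. \]

It is clear that the eigenvalues $\lambda_1(H), \dots, \lambda_n(H)$ of $H$ can be written as

\[ \lambda_1(H) = \sigma_1^2\Big(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}\Big), \ \dots, \ \lambda_n(H) = \sigma_n^2\Big(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}\Big), \]

where $\sigma_i(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0})$ are the singular values of $\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}$.
The following concentration result will serve as our main lemma.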
The following concentration result will serve as our main lemma. 

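Lemma 5.5. Assume that $f$ is a function so that $g(x) := f(x^2)$ is convex and has finite
Lipschitz norm $\|g\|_L$. Then for any $\delta \ge CK\|g\|_L/n$, where $K = 10\log n$ is the upper bound
for the entries of $Y_{(n-1)\times n}$ and $C$ is a sufficiently large absolute constant, we have

\[ P\Big( \Big| \sum_{i=1}^n f(\lambda_i(H)) - E\Big(\sum_{i=1}^n f(\lambda_i(H))\Big) \Big| \ge \delta n \Big) = O\Big( \exp\Big(-C' \frac{n^2\delta^2}{K^2\|g\|_L^2}\Big) \Big), \]

where $C'$ and the implied constant depend on $C$.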



Remark that when $F_{z_0}$ vanishes, Lemma 5.5 is essentially [17, Corollary 1.8] of Guionnet and
Zeitouni. We will show that the method there can be easily extended for any deterministic
matrix $F_{z_0}$.



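Proof (of Lemma 5.5). Consider the following Hermitian matrix $K_{2n}$ of size $2n \times 2n$:

\[ K_{2n} := \begin{pmatrix} 0 & \big(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}\big)^* \\ \frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0} & 0 \end{pmatrix}. \]

Apparently,

\[ K_{2n}^2 = \begin{pmatrix} \big(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}\big)^*\big(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}\big) & 0 \\ 0 & \big(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}\big)\big(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}\big)^* \end{pmatrix}. \]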



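So to prove Lemma 5.5, it is enough to show that

\[ P\Big( \Big| \sum_{i=1}^{2n} g(\lambda_i(K_{2n})) - E\Big(\sum_{i=1}^{2n} g(\lambda_i(K_{2n}))\Big) \Big| \ge 2\delta n \Big) = O\Big( \exp\Big(-C' \frac{n^2\delta^2}{K^2\|g\|_L^2}\Big) \Big), \tag{18} \]

where $\lambda_i(K_{2n})$ are the eigenvalues of $K_{2n}$.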
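The reduction to (18) rests on the fact that the eigenvalues of the Hermitian dilation are $\pm$ the singular values of the off-diagonal block, so that $\sum_i g(\lambda_i(K_{2n})) = 2\sum_i f(\lambda_i(H))$ for $g(x) = f(x^2)$. A minimal numerical sketch of this fact follows; the matrix `A` is a random stand-in for $\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}$, and the helper name `hermitize` and the test function `f` are illustrative choices, not taken from the paper.

```python
import numpy as np

def hermitize(A):
    """Hermitian dilation [[0, A*], [A, 0]] of a square matrix A."""
    n = A.shape[0]
    Z = np.zeros((n, n), dtype=complex)
    return np.block([[Z, A.conj().T], [A, Z]])

rng = np.random.default_rng(0)
n = 5
A = rng.uniform(-1, 1, (n, n)) + 1j * rng.uniform(-1, 1, (n, n))

sigma = np.linalg.svd(A, compute_uv=False)       # singular values of A
eig_K = np.linalg.eigvalsh(hermitize(A))         # eigenvalues of the dilation
eig_H = np.linalg.eigvalsh(A @ A.conj().T)       # eigenvalues of H = A A*

f = lambda x: np.log(np.maximum(0.1, x))         # an example test function
g = lambda x: f(x ** 2)                          # g(x) = f(x^2), an even function

print(np.sum(g(eig_K)), 2 * np.sum(f(eig_H)))    # the two sums agree
```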

Next, by following [17, Lemma 1.2] we obtain the following.

Lemma 5.6. The function $M \mapsto \mathrm{tr}(g(\frac{1}{\sqrt{n}}M + F))$ of Hermitian matrices $M = (m_{ij})_{1 \le i,j \le n}$,
where $F$ is a deterministic Hermitian matrix whose entries may depend on $n$, is a

• convex function;

• Lipschitz function of constant bounded by $2\|g\|_L$.



We refer the reader to Appendix A for a proof of Lemma 5.6. To deduce (18) from Lemma
5.6, we apply the following well-known Talagrand concentration inequality [29].



Lemma 5.7. Let $D$ be the disk $\{z \in \mathbb{C}, |z| \le K\}$. For every product probability $\mu$ in $D^N$,
every convex function $F : \mathbb{C}^N \to \mathbb{R}$ of Lipschitz norm $\|F\|_L$, and every $r \ge 0$,

\[ P(|F - M(F)| \ge r) \le 4\exp(-r^2/16K^2\|F\|_L^2), \]

where $M(F)$ denotes the median of $F$.

Indeed, let $F$ be the function $Y' \mapsto \mathrm{tr}(g(K_{2n})) = \mathrm{tr}(g(\frac{1}{\sqrt{n}}Y' + F'))$, where

\[ Y' := \begin{pmatrix} 0 & Y_{(n-1)\times n}^* \\ Y_{(n-1)\times n} & 0 \end{pmatrix} \]

and

\[ F' := \begin{pmatrix} 0 & -F_{z_0}^* \\ -F_{z_0} & 0 \end{pmatrix}. \]

Observe that the entries of $Y'$ are supported on $|x| \le K = 10\log n$. By Lemma 5.6, $F$ is a
convex function with Lipschitz constant bounded by $2\|g\|_L$. The conclusion (18) of Lemma
5.5 then follows by applying Lemma 5.7.

□



In what follows we will apply Lemma 5.5 for two functions: one gives an almost complete
control on the large spectra of $H$ and one yields a good bound on the number of small
spectra of $H$. We will choose $c$ to be a sufficiently small constant, and with room to spare
we set

\[ \epsilon = \delta = \Theta(n^{-c}). \]



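5.8. Concentration of large spectra for i.i.d. matrices. Following [10] and [13], we
first apply Lemma 5.5 to the cut-off function $f_\epsilon(x) := \log(\max(\epsilon, x))$. Note that $f_\epsilon(x^2)$ has
Lipschitz constant $2\epsilon^{-1/2}$. Although the function is not convex, it is easy to write it as a
difference of two convex functions of Lipschitz constant $O(\epsilon^{-1/2})$, and so Lemma 5.5 applies
because $\delta = \Theta(n^{-c}) \ge C\epsilon^{-1/2}K/n$.

Theorem 5.9. We have

\[ P\Big( \Big| \sum_{\sigma_i^2(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}) \in S_\epsilon} \log\sigma_i\Big(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}\Big) - E\Big( \sum_{\sigma_i^2(\cdots) \in S_\epsilon} \log\sigma_i(\cdots) \Big) \Big| \ge \delta n \Big) = O(\exp(-n^2\delta^2\epsilon/K^2)) = O(\exp(-n\log^2 n)), \]

where $S_\epsilon := \{x \in \mathbb{R}, x \ge \epsilon\}$.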
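The text above asserts that $g_\epsilon(x) = f_\epsilon(x^2) = \log\max(\epsilon, x^2)$ is a difference of two convex functions of Lipschitz constant $O(\epsilon^{-1/2})$ without writing one down. The sketch below checks one possible decomposition $g_\epsilon = h_1 - h_2$; the choice $h_1(x) = \log\epsilon + 2\epsilon^{-1/2}\max(|x| - \epsilon^{1/2}, 0)$ is an assumption made here for illustration and need not be the one the authors had in mind.

```python
import numpy as np

def g_eps(x, eps):
    """Cut-off function g(x) = f_eps(x^2) = log(max(eps, x^2))."""
    return np.log(np.maximum(eps, x ** 2))

def h1(x, eps):
    """Convex piecewise-linear majorant with slope 2/sqrt(eps) outside [-sqrt(eps), sqrt(eps)]."""
    r = np.sqrt(eps)
    return np.log(eps) + (2.0 / r) * np.maximum(np.abs(x) - r, 0.0)

def h2(x, eps):
    """Remainder h1 - g; its derivative 2/sqrt(eps) - 2/|x| is nondecreasing, so it is convex."""
    return h1(x, eps) - g_eps(x, eps)

eps = 0.01
x = np.linspace(-3.0, 3.0, 6001)
dx = x[1] - x[0]
for name, h in (("h1", h1), ("h2", h2)):
    vals = h(x, eps)
    second_diff = vals[:-2] - 2 * vals[1:-1] + vals[2:]   # >= 0 for a convex function
    max_slope = np.max(np.abs(np.diff(vals))) / dx
    print(name, second_diff.min(), max_slope, 2 / np.sqrt(eps))
```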

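For short, from now on we set

\[ h_{\epsilon, Y_{(n-1)\times n}}(z_0) := \frac{1}{n} E\Big( \sum_{\sigma_i^2(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}) \in S_\epsilon} \log\sigma_i\Big(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}\Big) \Big). \]

Serving as the main term, $h_{\epsilon, Y_{(n-1)\times n}}(z_0)$ will play a key role in our analysis. In our next
subsection we apply Lemma 5.5 to another function $f$.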



5.10. Concentration of the number of small eigenvalues for i.i.d. matrices. Let $I$
be the interval $[0, \epsilon]$. We are going to show that the number $N_I$ of the eigenvalues $\lambda_i(H)$
which belong to $I$ is small with very high probability.

It is not hard to construct two functions $f_1, f_2$ such that $(f_1 - f_2) - 1_I$ is non-negative and
supported on an interval of length $\epsilon/C$, and so that both of $g_1(x) = f_1(x^2)$ and $g_2(x) =
f_2(x^2)$ are convex functions of Lipschitz constant $O(\epsilon^{-1/2})$. (For instance one may construct
$f_1(x), f_2(x)$ in such a way that the even function $g_1(x) = f_1(x^2)$ is identical to 1 on the
interval $[-\epsilon^{1/2}, \epsilon^{1/2}]$ and being straight concave down from both edges with a slope of $O(\epsilon^{-1/2})$,
while the graph of the function $g_2(x) = f_2(x^2)$ is obtained from that of $g_1(x)$ by replacing
its positive part with zero).
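The parenthetical construction can be made concrete as follows. This is a sketch under an assumption introduced here: the slope is chosen so that $g_1$ reaches zero exactly at $|x| = (\epsilon + \epsilon/C)^{1/2}$, which makes $(f_1 - f_2) - 1_I$ vanish outside $[\epsilon, \epsilon + \epsilon/C]$. The helper name `make_g` is hypothetical.

```python
import numpy as np

def make_g(eps, C):
    """Return (g1, g2, slope): g1 = 1 on [-sqrt(eps), sqrt(eps)], linear beyond; g2 = min(g1, 0)."""
    r = np.sqrt(eps)
    width = r * (np.sqrt(1.0 + 1.0 / C) - 1.0)  # g1 hits zero at |x| = r + width
    slope = 1.0 / width                          # of order eps^{-1/2} for fixed C

    def g1(x):
        return 1.0 - slope * np.maximum(np.abs(x) - r, 0.0)

    def g2(x):
        return np.minimum(g1(x), 0.0)            # g1 with its positive part replaced by zero

    return g1, g2, slope

eps, C = 0.04, 4.0
g1, g2, slope = make_g(eps, C)
lam = np.linspace(0.0, 1.0, 10001)               # eigenvalue variable, x = sqrt(lam)
f1, f2 = g1(np.sqrt(lam)), g2(np.sqrt(lam))
err = (f1 - f2) - (lam <= eps)                   # (f1 - f2) - 1_I
print(err.min(), slope * np.sqrt(eps))
```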



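Next, by Lemma 5.5 we have

\[ P\Big( \Big| \sum_{\lambda_i(H)} f_1(\lambda_i(H)) - E\Big(\sum_{\lambda_i(H)} f_1(\lambda_i(H))\Big) \Big| \ge \delta n \Big) = O(\exp(-n\log^2 n)), \]

and

\[ P\Big( \Big| \sum_{\lambda_i(H)} f_2(\lambda_i(H)) - E\Big(\sum_{\lambda_i(H)} f_2(\lambda_i(H))\Big) \Big| \ge \delta n \Big) = O(\exp(-n\log^2 n)). \]

By the triangle inequality, we thus have

\[ P\Big( \Big| \sum_{\lambda_i(H)} (f_1 - f_2)(\lambda_i(H)) - E\Big(\sum_{\lambda_i(H)} (f_1 - f_2)(\lambda_i(H))\Big) \Big| \ge 2\delta n \Big) = O(\exp(-n\log^2 n)). \]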



Because the error function $f = (f_1 - f_2) - 1_I$ is nonnegative, it follows that with probability
$1 - O(\exp(-n\log^2 n))$

\[ \sum_{\lambda_i(H)} 1_I(\lambda_i(H)) + \sum_{\lambda_i(H)} f(\lambda_i(H)) \le E\Big( \sum_{\lambda_i(H)} (f_1 - f_2)(\lambda_i(H)) \Big) + 2\delta n, \]

and hence

\begin{align*}
N_I = \sum_{\lambda_i(H)} 1_I(\lambda_i(H)) &\le E\Big( \sum_{\lambda_i(H)} (f_1 - f_2)(\lambda_i(H)) \Big) + 2\delta n \\
&\le 2E\Big( \sum_{\lambda_i(H)} 1_J(\lambda_i(H)) \Big) + 2\delta n \\
&\le 2E(N_J) + 2\delta n,
\end{align*}

where $J$ is the interval $[0, \epsilon + \epsilon/C]$ and $N_J$ is the number of eigenvalues of $H$ in $J$. (Strictly
speaking, we have to set $J = [-\epsilon/C, \epsilon + \epsilon/C]$. However, as the $\lambda_i$ are non-negative, we can omit
its negative interval.)

To exploit the above information further, we apply a result saying that $N_J$ has small
expected value (see also [37, Proposition 28] and the references therein).

Lemma 5.11. For all $J \subset \mathbb{R}$ with $|J| \ge K^2\log^2 n/n^{1/2}$, one has

\[ N_J \le Cn|J| \]

with probability $1 - \exp(-\omega(\log n))$. In particular,

\[ E(N_J) \le Cn|J|, \]

where $C$ is a sufficiently large constant.



Remark that this result holds for any deterministic matrix $F_{z_0}$ in the definition of $H$. We
defer the proof of Lemma 5.11 to Appendix B.

In summary, we have obtained the following result.

Theorem 5.12. With probability $O(\exp(-n\log^2 n))$, we have

\[ N_I \ge 2C\epsilon n + 2\delta n, \]

where $N_I$ is the number of $\sigma_i(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0})$ such that $\sigma_i^2(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}) \in [0, \epsilon]$.



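Consequently, it follows from Theorems 5.4 and 5.12 that with probability $1 - O(n^{-B})$ the
following holds

\[ \frac{1}{n} \sum_{\sigma_i^2(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}) \in [0, \epsilon]} \log\sigma_i\Big(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}\Big) = O((\epsilon + \delta)\log n) = O(n^{-c}\log n). \]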



Thus, combining with Theorem 5.9, we infer the following.

Theorem 5.13. Let $z_0$ be fixed and let $B$ be a positive constant. Then the following holds
with probability $1 - O(n^{-B})$:

\[ \Big| \frac{1}{n}\log\Big|\det\Big(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}\Big)\Big| - h_{\epsilon, Y_{(n-1)\times n}}(z_0) \Big| \le 2\delta + O(n^{-c}\log n) = O(n^{-c}\log n), \]

where the implied constants depend on $B$.

5.14. Asymptotic formula for $h_{\epsilon, Y_{(n-1)\times n}}(z_0)$. We next claim that $\frac{1}{n}\log|\det(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} -
F_{z_0})|$ also converges to the corresponding part of the circular law, thus giving an asymptotic
formula for $h_{\epsilon, Y_{(n-1)\times n}}(z_0)$.

Theorem 5.15. For almost all $z_0$, the following holds with probability one

\[ \frac{1}{n}\log\Big|\det\Big(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}\Big)\Big| - \int_{\mathbb{C}} \log|w - z_0|\, d\mu_{cir}(w) = o(1). \tag{19} \]

Note that this result is more or less a circular law for random matrices of i.i.d. entries. To
prove it we simply rely on [34].



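Proof (of Theorem 5.15). We first pass to Y,

\[ Y = \begin{pmatrix} y_{22} - y_{21} & \cdots & y_{2n} - y_{21} \\ y_{32} - y_{31} & \cdots & y_{3n} - y_{31} \\ \vdots & \ddots & \vdots \\ y_{n2} - y_{n1} & \cdots & y_{nn} - y_{n1} \end{pmatrix}, \]

where $y_{ij}$ are i.i.d. copies of $y$.
As

\[ \det\Big(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}\Big) = \sqrt{n}\det\Big(\frac{1}{\sqrt{n}}Y - z_0 I_{n-1}\Big), \]

it is enough to prove the claim for $\det(\frac{1}{\sqrt{n}}Y - z_0 I_{n-1})$.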

View Y as a sum of the matrix $(y_{ij})_{2 \le i,j \le n}$ and $R$, the $(n-1) \times (n-1)$ matrix formed
by $(-y_{i1}, \dots, -y_{i1})$ for $2 \le i \le n$. Because $R$ has rank one and the average square of its
entries $\frac{1}{n}\sum_i y_{i1}^2$ is bounded almost surely (with respect to $y_{21}, \dots, y_{n1}$), [34, Corollary
1.15] applied to Y implies that the ESD of $\frac{1}{\sqrt{n}}Y$ converges almost surely to the circular law.

Finally, thanks to [34, Theorem 1.20], for almost all $z_0$ the following holds with probability
one

\[ \frac{1}{n}\log\Big|\det\Big(\frac{1}{\sqrt{n}}Y - z_0 I_{n-1}\Big)\Big| - \int_{\mathbb{C}} \log|w - z_0|\, d\mu_{cir}(w) = o(1). \]

□



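Theorems 5.13 and 5.15 immediately imply that for almost all $z_0$

\[ h_{\epsilon, Y_{(n-1)\times n}}(z_0) - \int_{\mathbb{C}} \log|w - z_0|\, d\mu_{cir}(w) = o(1). \tag{20} \]

By substituting (20) back to Theorem 5.9, we have

\[ P\Big( \Big| \frac{1}{n} \sum_{\sigma_i^2(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}) \in S_\epsilon} \log\sigma_i\Big(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}\Big) - \int_{\mathbb{C}} \log|w - z_0|\, d\mu_{cir}(w) \Big| \ge \delta + o(1) \Big) = O(\exp(-n\log^2 n)). \tag{21} \]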



6. Large concentration for X, proof of Theorem 1.10 



In this section we will apply the transference principle of Lemma 2.3 to pass the results of
Section 5 back to X. Our treatment here is similar to [Hi Section 4].



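By Lemma 2.3 and (21), conditioning on $Y \in D_n$ we have

\[ P\Big( \Big| \frac{1}{n} \sum_{\sigma_i^2(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}) \in S_\epsilon} \log\sigma_i\Big(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}\Big) - \int_{\mathbb{C}} \log|w - z_0|\, d\mu_{cir}(w) \Big| \ge \delta + o(1) \ \Big|\ Y \in D_n \Big) = O(n^{O(n)}\exp(-n\log^2 n)) = O(\exp(-n\log^2 n/2)). \tag{22} \]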



Next, for each $Y \in D_n$ we will compare the singular values of $\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}$ with those
of $\sqrt{n}X_{(n-1)\times n} - F_{z_0}$, where X is determined by $\Phi(\frac{1}{n}Y)$, i.e. $x_{ij} = \frac{1}{n}y_{ij}$ for all $2 \le i, j \le n$.

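By definition, as $Y \in D_n$, we have $|\frac{1}{n}y_{i1} - x_{i1}| \le n^{-4}$, and so the operator norm of the
difference matrix is bounded by

\[ \Big\| \Big(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}\Big) - \big(\sqrt{n}X_{(n-1)\times n} - F_{z_0}\big) \Big\| \le n^{-3}. \]

This leads to a similar bound for the singular values for every $i$ (see for instance [19])

\[ \Big| \sigma_i\Big(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}\Big) - \sigma_i\big(\sqrt{n}X_{(n-1)\times n} - F_{z_0}\big) \Big| \le n^{-3}. \tag{23} \]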
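The passage from an operator-norm bound to (23) is the standard perturbation inequality $|\sigma_i(A) - \sigma_i(B)| \le \|A - B\|$ for singular values (Weyl's inequality). A small numerical sketch, with random matrices standing in for the two matrices compared above and an illustrative helper name `singular_value_shift`:

```python
import numpy as np

def singular_value_shift(A, B):
    """Return (max_i |sigma_i(A) - sigma_i(B)|, operator norm of A - B)."""
    sa = np.linalg.svd(A, compute_uv=False)   # sorted in decreasing order
    sb = np.linalg.svd(B, compute_uv=False)
    return np.max(np.abs(sa - sb)), np.linalg.norm(A - B, 2)

rng = np.random.default_rng(0)
A = rng.normal(size=(8, 8))
E = 1e-3 * rng.normal(size=(8, 8))            # a small perturbation
shift, op_norm = singular_value_shift(A, A + E)
print(shift, op_norm)                          # shift never exceeds op_norm
```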



Notice furthermore that, conditioning on $Y \in D_n$, $\Phi(\frac{1}{n}Y)$ is uniformly distributed on the
set $S_n$ of bounded doubly stochastic matrices X. Thus, by a slight modification of $\epsilon$ by
an amount of $n^{-2}$ (thus the order of $\epsilon$ remains $\Theta(n^{-c})$), we obtain from (22) the following
upper tail bound with respect to X

\[ P\Big( \frac{1}{n} \sum_{\sigma_i^2(\sqrt{n}X_{(n-1)\times n} - F_{z_0}) \in S_{\epsilon + n^{-2}}} \log\sigma_i(\sqrt{n}X_{(n-1)\times n} - F_{z_0}) - \int_{\mathbb{C}} \log|w - z_0|\, d\mu_{cir}(w) \ge \delta + o(1) \Big) = O(\exp(-n\log^2 n/2)). \]

Also, we obtain a similar probability bound for the lower tail

\[ P\Big( \frac{1}{n} \sum_{\sigma_i^2(\sqrt{n}X_{(n-1)\times n} - F_{z_0}) \in S_{\epsilon - n^{-2}}} \log\sigma_i(\sqrt{n}X_{(n-1)\times n} - F_{z_0}) - \int_{\mathbb{C}} \log|w - z_0|\, d\mu_{cir}(w) \le -(\delta + o(1)) \Big) = O(\exp(-n\log^2 n/2)). \]
Notice that these bounds hold for any $\epsilon = \Theta(n^{-c})$. By gluing them together we infer the
following variant of (22).

Theorem 6.1. With respect to X we have

\[ P\Big( \Big| \frac{1}{n} \sum_{\sigma_i^2(\sqrt{n}X_{(n-1)\times n} - F_{z_0}) \in S_\epsilon} \log\sigma_i(\sqrt{n}X_{(n-1)\times n} - F_{z_0}) - \int_{\mathbb{C}} \log|w - z_0|\, d\mu_{cir}(w) \Big| \ge \delta + o(1) \Big) = O(\exp(-n\log^2 n/2)). \]



Next, conditioning on $Y \in D_n$, by Theorem 5.12 and Lemma 2.3, with probability
$O(n^{O(n)}\exp(-n\log^2 n)) = O(\exp(-n\log^2 n/2))$ we have

\[ N_I \ge 2C\epsilon n + 2\delta n, \]

where $N_I$ is the number of $\sigma_i(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0})$ such that $\sigma_i^2(\frac{1}{\sqrt{n}}Y_{(n-1)\times n} - F_{z_0}) \in [0, \epsilon]$.

Because $\Phi(\frac{1}{n}Y)$ is uniformly distributed on the set $S_n$ conditioning on $Y \in D_n$, and also
because of (23), we infer the following.

Theorem 6.2. With probability $O(\exp(-n\log^2 n))$ with respect to X, we have

\[ N_I \ge 2C\Big(\epsilon + \frac{1}{n^2}\Big)n + 2\delta n, \]

where $N_I$ is the number of $\sigma_i(\sqrt{n}X_{(n-1)\times n} - F_{z_0})$ such that $\sigma_i^2(\sqrt{n}X_{(n-1)\times n} - F_{z_0}) \in [0, \epsilon]$.

We now gather the ingredients together to complete the proof of our main result.



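Proof (of Theorem 1.10 for X). By Theorems 5.2 and 6.2 we have that

\[ P\Big( \frac{1}{n} \sum_{\sigma_i^2(\sqrt{n}X_{(n-1)\times n} - F_{z_0}) \in [0, \epsilon]} \log\sigma_i(\sqrt{n}X_{(n-1)\times n} - F_{z_0}) = O((\epsilon + \delta)\log n) \Big) = 1 - O(n^{-3}). \]

A combination of this fact with Theorem 6.1 implies that for almost all $z_0$,

\[ P\Big( \frac{1}{n}\log|\det(\sqrt{n}X_{(n-1)\times n} - F_{z_0})| - \int_{\mathbb{C}} \log|w - z_0|\, d\mu_{cir}(w) = o(1) \Big) = 1 - O(n^{-3}). \]

Hence, by (17),

\[ P\Big( \frac{1}{n}\log|\det(\sqrt{n}X - z_0 I_{n-1})| - \int_{\mathbb{C}} \log|w - z_0|\, d\mu_{cir}(w) = o(1) \Big) = 1 - O(n^{-3}), \]

completing the proof.

□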



Appendix A. Proof of Lemma 5.6



The main goal of this section is to justify Lemma 5.6. Although our proof is identical to
[17, Theorem 1.1] and [17, Corollary 1.8], let us present it here for the sake of completeness.



A.1. Convexity. For simplicity, we first show that the function $M \mapsto \mathrm{tr}(g(M + F))$ is
convex. It then follows that the function $M \mapsto \mathrm{tr}(g(\frac{1}{\sqrt{n}}M + F))$ is also convex.

For any Hermitian matrices $U$ and $V$

\[ g(V + F) - g(U + F) = \int_0^1 Dg(U + F + \eta(V - U)) \sharp (V - U)\, d\eta, \]

where

\[ Dg(U + F) \sharp V = \lim_{\epsilon \to 0} \epsilon^{-1}\big[ g(U + F + \epsilon V) - g(U + F) \big]. \]

For polynomial functions $g$, the non-commutative derivation $D$ can be computed and one
finds in particular that for any $p \in \mathbb{N}$,

\[ (V + F)^p - (U + F)^p = \int_0^1 \Big( \sum_{k=0}^{p-1} (U + F + \eta(V - U))^k (V - U)(U + F + \eta(V - U))^{p-k-1} \Big) d\eta. \tag{24} \]
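Identity (24) can be checked numerically. The integrand is a matrix polynomial in $\eta$ of degree $p - 1$, so a $p$-point Gauss-Legendre rule integrates it exactly up to rounding. The following is a sketch only; `random_hermitian` and `rhs_24` are illustrative names introduced here.

```python
import numpy as np
from numpy.linalg import matrix_power

def random_hermitian(n, rng):
    """A random n x n Hermitian matrix (illustrative test input)."""
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (A + A.conj().T) / 2

def rhs_24(U, V, F, p):
    """Right-hand side of (24), integrated over [0, 1] by Gauss-Legendre quadrature."""
    nodes, weights = np.polynomial.legendre.leggauss(p)  # exact up to degree 2p - 1
    total = np.zeros(U.shape, dtype=complex)
    for x, w in zip(nodes, weights):
        eta = (x + 1) / 2                                # map [-1, 1] to [0, 1]
        Z = U + F + eta * (V - U)
        for k in range(p):
            total += (w / 2) * matrix_power(Z, k) @ (V - U) @ matrix_power(Z, p - 1 - k)
    return total

rng = np.random.default_rng(0)
U, V, F = (random_hermitian(4, rng) for _ in range(3))
p = 5
lhs = matrix_power(V + F, p) - matrix_power(U + F, p)
print(np.max(np.abs(lhs - rhs_24(U, V, F, p))))          # close to zero
```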

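For such a polynomial function, by taking the trace and using $\mathrm{tr}(AB) = \mathrm{tr}(BA)$, one
deduces that

\[ \mathrm{tr}((U + F)^p) - \mathrm{tr}\Big(\Big(\frac{U + V}{2} + F\Big)^p\Big) = p \int_0^1 \mathrm{tr}\Big( \Big(\frac{U + V}{2} + F + \eta\frac{U - V}{2}\Big)^{p-1} \frac{U - V}{2} \Big) d\eta, \tag{25} \]

\[ \mathrm{tr}((V + F)^p) - \mathrm{tr}\Big(\Big(\frac{U + V}{2} + F\Big)^p\Big) = p \int_0^1 \mathrm{tr}\Big( \Big(\frac{U + V}{2} + F - \eta\frac{U - V}{2}\Big)^{p-1} \frac{V - U}{2} \Big) d\eta. \tag{26} \]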



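It follows from (24), (25) and (26) that

\begin{align*}
\Delta &:= \mathrm{tr}((U + F)^p) + \mathrm{tr}((V + F)^p) - 2\,\mathrm{tr}\Big(\Big(\frac{U + V}{2} + F\Big)^p\Big) \\
&= \frac{p}{2} \sum_{k=0}^{p-2} \int_0^1 \int_0^1 \eta\, d\eta\, d\theta\ \mathrm{tr}\big( (U - V) Z_{\eta,\theta}^k (U - V) Z_{\eta,\theta}^{p-2-k} \big), \tag{27}
\end{align*}

with

\[ Z_{\eta,\theta} := \frac{U + V}{2} + F - \eta\frac{U - V}{2} + \eta\theta(U - V). \]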



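Next, for fixed $(\eta, \theta) \in [0, 1]^2$, and fixed $U, V, F$ Hermitian matrices, $Z_{\eta,\theta}$ is also Hermitian,
and so we can find a unitary matrix $U_{\eta,\theta}$ and a diagonal matrix $D_{\eta,\theta}$ with real diagonal
entries $\lambda_{\eta,\theta}(1), \dots, \lambda_{\eta,\theta}(n)$ so that

\[ Z_{\eta,\theta} = U_{\eta,\theta} D_{\eta,\theta} U_{\eta,\theta}^*. \]

Let $W_{\eta,\theta} = U_{\eta,\theta}^*(U - V)U_{\eta,\theta}$. Then

\begin{align*}
\Delta &= \frac{p}{2} \sum_{k=0}^{p-2} \int_0^1 \int_0^1 \eta\, d\eta\, d\theta\ \mathrm{tr}\big( W_{\eta,\theta} D_{\eta,\theta}^k W_{\eta,\theta} D_{\eta,\theta}^{p-2-k} \big) \\
&= \frac{p}{2} \sum_{k=0}^{p-2} \int_0^1 \int_0^1 \eta\, d\eta\, d\theta \sum_{1 \le i,j \le n} \lambda_{\eta,\theta}^k(i)\, \lambda_{\eta,\theta}^{p-2-k}(j)\, |W_{\eta,\theta}(ij)|^2. \tag{28}
\end{align*}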
fc=0 u u fc=0 l<i J<n 

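But

\[ \sum_{k=0}^{p-2} \lambda_{\eta,\theta}^k(i)\, \lambda_{\eta,\theta}^{p-2-k}(j) = \frac{\lambda_{\eta,\theta}^{p-1}(i) - \lambda_{\eta,\theta}^{p-1}(j)}{\lambda_{\eta,\theta}(i) - \lambda_{\eta,\theta}(j)} = (p - 1) \int_0^1 \big( \alpha\lambda_{\eta,\theta}(j) + (1 - \alpha)\lambda_{\eta,\theta}(i) \big)^{p-2} d\alpha. \]

Hence, substituting in (28) gives

\[ \Delta = \frac{1}{2} \sum_{1 \le i,j \le n} \int_0^1 \int_0^1 \int_0^1 d\alpha\, \eta\, d\eta\, d\theta\ |W_{\eta,\theta}(ij)|^2\, g''\big( \alpha\lambda_{\eta,\theta}(j) + (1 - \alpha)\lambda_{\eta,\theta}(i) \big) \ge 0 \tag{29} \]

for the polynomial $g(x) = x^p$.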

Now, with $U, V, F$ being fixed, the eigenvalues $\lambda_{\eta,\theta}(1), \dots, \lambda_{\eta,\theta}(n)$ and the entries of $W_{\eta,\theta}$
are uniformly bounded. Hence, by Runge's theorem, we can deduce by approximation that
(29) holds for any twice continuously differentiable function $g$. As a consequence, for any
such convex function, $g'' \ge 0$ and

\[ \Delta = \mathrm{tr}(g(U + F)) + \mathrm{tr}(g(V + F)) - 2\,\mathrm{tr}\Big(g\Big(\frac{U + V}{2} + F\Big)\Big) \ge 0. \]
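The midpoint inequality $\Delta \ge 0$ is easy to probe numerically for specific convex $g$. The sketch below evaluates $\Delta$ for random Hermitian $U, V, F$; the helper names and the choices $g(x) = x^4$ and $g(x) = |x|$ are illustrative assumptions, not part of the proof.

```python
import numpy as np

def random_hermitian(n, rng):
    """A random n x n Hermitian matrix (illustrative test input)."""
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (A + A.conj().T) / 2

def trace_fn(g, M):
    """tr(g(M)) for Hermitian M, computed through the eigenvalues of M."""
    return float(np.sum(g(np.linalg.eigvalsh(M))))

def midpoint_gap(g, U, V, F):
    """Delta = tr g(U+F) + tr g(V+F) - 2 tr g((U+V)/2 + F)."""
    return trace_fn(g, U + F) + trace_fn(g, V + F) - 2 * trace_fn(g, (U + V) / 2 + F)

rng = np.random.default_rng(0)
U, V, F = (random_hermitian(6, rng) for _ in range(3))
print(midpoint_gap(lambda x: x ** 4, U, V, F))   # nonnegative
print(midpoint_gap(np.abs, U, V, F))             # nonnegative
```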

A.2. Boundedness. Now we show that the function $M \mapsto \mathrm{tr}(g(\frac{1}{\sqrt{n}}M + F))$ has Lipschitz
constant bounded by $2\|g\|_L$.

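First, for any bounded continuously differentiable function $g$ we will show that

\[ \sum_{i,j} \Big( \partial_{\Re(x_{ij})} \mathrm{tr}\Big(g\Big(\frac{1}{\sqrt{n}}M + F\Big)\Big) \Big)^2 + \sum_{i,j} \Big( \partial_{\Im(x_{ij})} \mathrm{tr}\Big(g\Big(\frac{1}{\sqrt{n}}M + F\Big)\Big) \Big)^2 \le 4\|g\|_L^2. \]

We can verify that

\[ \partial_{\Re(x_{ij})} \mathrm{tr}\Big(g\Big(\frac{1}{\sqrt{n}}M + F\Big)\Big) = \frac{1}{\sqrt{n}} \mathrm{tr}\Big( g'\Big(\frac{1}{\sqrt{n}}M + F\Big) \Delta_{ij} \Big), \tag{30} \]

where $\Delta_{ij}(kl) = 1$ if $kl = ij$ or $ji$ and zero otherwise.

Indeed, (30) is a consequence of (24) for polynomial functions, and it can be extended for
bounded continuously differentiable functions by approximations. In other words, we have

\[ \partial_{\Re(x_{ij})} \mathrm{tr}\Big(g\Big(\frac{1}{\sqrt{n}}M + F\Big)\Big) = \begin{cases} \frac{1}{\sqrt{n}} \Big( g'\big(\frac{1}{\sqrt{n}}M + F\big)(ij) + g'\big(\frac{1}{\sqrt{n}}M + F\big)(ji) \Big), & i \ne j; \\ \frac{1}{\sqrt{n}}\, g'\big(\frac{1}{\sqrt{n}}M + F\big)(ii), & i = j. \end{cases} \]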



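Hence,

\[ \sum_{i,j} \Big( \partial_{\Re(x_{ij})} \mathrm{tr}\Big(g\Big(\frac{1}{\sqrt{n}}M + F\Big)\Big) \Big)^2 \le \frac{2}{n} \sum_{i,j} \Big| g'\Big(\frac{1}{\sqrt{n}}M + F\Big)(ij) \Big|^2 = \frac{2}{n} \mathrm{tr}\Big( g'\Big(\frac{1}{\sqrt{n}}M + F\Big) g'\Big(\frac{1}{\sqrt{n}}M + F\Big)^* \Big). \]

But if $\lambda_1, \dots, \lambda_n$ denote the eigenvalues of $\frac{1}{\sqrt{n}}M + F$ then

\[ \mathrm{tr}\Big( g'\Big(\frac{1}{\sqrt{n}}M + F\Big) g'\Big(\frac{1}{\sqrt{n}}M + F\Big)^* \Big) = \sum_{i} (g'(\lambda_i))^2 \le n\|g'\|_\infty^2. \]

Thus we have

\[ \sum_{i,j} \Big( \partial_{\Re(x_{ij})} \mathrm{tr}\Big(g\Big(\frac{1}{\sqrt{n}}M + F\Big)\Big) \Big)^2 \le 2\|g'\|_\infty^2. \]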



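The same argument applies for derivatives with respect to $\Im(x_{ij})$, and so by integration by
parts and by the Cauchy-Schwarz inequality

\[ \Big| \mathrm{tr}\Big(g\Big(\frac{1}{\sqrt{n}}U + F\Big)\Big) - \mathrm{tr}\Big(g\Big(\frac{1}{\sqrt{n}}V + F\Big)\Big) \Big| \le 2\|g\|_L \|U - V\| \]

for any $U$ and $V$.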
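The Lipschitz bound can be probed numerically in the same way. The sketch below assumes that $\|U - V\|$ is the Frobenius (Euclidean) norm of the entries, which is the norm relevant for Lemma 5.7, and uses the 1-Lipschitz function $g(x) = |x|$; the helper names are illustrative.

```python
import numpy as np

def random_hermitian(n, rng):
    """A random n x n Hermitian matrix (illustrative test input)."""
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (A + A.conj().T) / 2

def trace_g(g, M):
    """tr(g(M)) for Hermitian M."""
    return float(np.sum(g(np.linalg.eigvalsh(M))))

def lipschitz_ratio(g, U, V, F):
    """|tr g(U/sqrt n + F) - tr g(V/sqrt n + F)| divided by the Frobenius norm of U - V."""
    s = np.sqrt(U.shape[0])
    num = abs(trace_g(g, U / s + F) - trace_g(g, V / s + F))
    return num / np.linalg.norm(U - V, "fro")

rng = np.random.default_rng(0)
U, V, F = (random_hermitian(8, rng) for _ in range(3))
print(lipschitz_ratio(np.abs, U, V, F))   # stays below 2 * ||g||_L = 2
```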

Observe that this last result for bounded continuously differentiable function g naturally 
extends to Lipschitz functions by approximations, completing the proof. 



Appendix B. Proof of Lemma 5.11

Note that if $F_{z_0}$ vanishes then this is [37, Proposition 28] (see also [2]). We show that the
method there extends easily to any deterministic $F_{z_0}$.



Assume for contradiction that

\[ |N_J| \ge Cn|J| \]

for some large constant $C$ to be chosen later. We will show that this will lead to a contradiction
with high probability.
We will control the eigenvalue counting function $N_J$ via the Stieltjes transform

\[ s(z) := \frac{1}{n} \sum_{j=1}^n \frac{1}{\lambda_j(H) - z}. \]

Fix $J$ and let $x$ be the midpoint of $J$. Set $\eta := |J|/2$ and $z := x + i\eta$; we then have

5 r/n 

Hence,

\[ \Im(s(z)) \ge C. \tag{31} \]

Next, with H' := (^(Y) - F Z0 )(^(Y) - F zo f = ±MM* where M := $(f ) - y^o, 
we have (see also [2j Chapter 11]) 



k<n 



1 



where $h'_{kk}$ is the $kk$ entry of $H'$; $H'_k$ is the $n - 1$ by $n - 1$ matrix with the $k$-th row and $k$-th
column of $H'$ removed; and $a_k$ is the $k$-th column of $H'$ with the $k$-th entry removed.
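The expansion of the Stieltjes transform along the diagonal rests on the standard Schur complement formula for the diagonal entries of a resolvent, $((H' - zI)^{-1})_{kk} = 1/(h'_{kk} - z - a_k^*(H'_k - zI)^{-1}a_k)$. The sketch below checks numerically that averaging these entries reproduces $\frac{1}{n}\sum_j 1/(\lambda_j - z)$. The matrix used is a generic positive semidefinite stand-in for $H'$, and the function names are illustrative.

```python
import numpy as np

def schur_diagonal(H, z, k):
    """k-th diagonal entry of (H - zI)^{-1} via the Schur complement formula."""
    n = H.shape[0]
    idx = [i for i in range(n) if i != k]
    Hk = H[np.ix_(idx, idx)]          # H with the k-th row and column removed
    a = H[idx, k]                     # k-th column of H with the k-th entry removed
    correction = a.conj() @ np.linalg.solve(Hk - z * np.eye(n - 1), a)
    return 1.0 / (H[k, k] - z - correction)

def stieltjes(H, z):
    """Stieltjes transform (1/n) * sum_j 1 / (lambda_j(H) - z)."""
    lam = np.linalg.eigvalsh(H)
    return np.mean(1.0 / (lam - z))

rng = np.random.default_rng(0)
n = 6
M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
H = M @ M.conj().T / n                # a positive semidefinite Hermitian matrix
z = 0.7 + 0.2j
print(np.mean([schur_diagonal(H, z, k) for k in range(n)]), stieltjes(H, z))
```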



Noting that $|\Im(1/z)| \le 1/|\Im(z)|$, one concludes from (31) that

\[ \frac{1}{n} \sum_{k \le n} \frac{1}{|\eta + \Im(a_k^*(H'_k - zI)^{-1}a_k)|} \ge C. \]

By the pigeonhole principle, there exists $k$ such that

\[ \frac{1}{|\eta + \Im(a_k^*(H'_k - zI)^{-1}a_k)|} \ge C. \tag{32} \]

Fix such $k$, note that

\[ a_k = \frac{1}{n}M_k r_k^*, \quad \text{and} \quad H'_k = \frac{1}{n}M_k M_k^*, \]

where $r_k = r_k(M)$ and $M_k$ is the $(n - 1) \times n$ matrix formed by removing $r_k(M)$ from $M$.
Thus if we let $v_1 = v_1(M_k), \dots, v_{n-1} = v_{n-1}(M_k)$ and $u_1 = u_1(M_k), \dots, u_{n-1} = u_{n-1}(M_k)$
be the orthogonal systems of left and right singular vectors of $M_k$ and let $\lambda_j = \lambda_j(H'_k) =
\frac{1}{n}\sigma_j^2(M_k)$ be the associated eigenvalues, one has

\[ a_k^*(H'_k - zI)^{-1}a_k = \sum_{1 \le j \le n-1} \frac{|a_k^* v_j|^2}{\lambda_j - z}. \]
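Thus

\[ \Im(a_k^*(H'_k - zI)^{-1}a_k) \ge \eta \sum_{1 \le j \le n-1} \frac{|a_k^* v_j|^2}{\eta^2 + |\lambda_j - x|^2}. \]

We conclude from (32) that

\[ \sum_{1 \le j \le n-1} \frac{|a_k^* v_j|^2}{\eta^2 + |\lambda_j - x|^2} \le \frac{1}{C\eta}. \]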



Note that atv,- can be written as 



a fcVj = r k Uj. 

Next, from the Cauchy interlacing law, one can find an interval L C {1, . . . , n — 1} of length 

\L\ > Cr]n 

such that Xj £ L. We conclude that 



Since Xj G J, one has <7j = 0(^/n), and thus 

^|r feUj | 2 «^. 

The LHS can be written as $\|\pi_V(r_k^*)\|^2$, where $V$ is the span of the eigenvectors $u_j$ for $j \in L$
and $\pi_V(\cdot)$ is the projection onto $V$. But from the Talagrand inequality for distance (Lemma
B.1 below), we see that this quantity is $\gg \eta n$ with very high probability, giving the desired
contradiction.

Lemma B.1. Assume that $V \subset \mathbb{C}^n$ is a subspace of dimension $\dim(V) = d \le n - 10$. Let
$f$ be a fixed vector (whose coordinates may depend on $n$). Let $y = (0, y_2, \dots, y_n)$, where
$y_i = \tilde{y}_i - 1$ and $\tilde{y}_i$ are i.i.d. copies of $\tilde{y}$ defined from (2). Let $\sigma = \Theta(1)$ denote the standard
deviation of $y_i$ and $K = 10\log n$ denote the upper bound of $y_i$, then for any $t > 0$ we have

\[ P_y\Big( \|\pi_V(y + f)\| \ge \sqrt{2}\sigma\sqrt{d}/2 - O(K) - t \Big) \ge 1 - O\Big( \exp\Big(-\frac{t^2}{16K^2}\Big) \Big). \]



We now give a proof of Lemma B.1. It is clear that the function $(y_2, \dots, y_n) \mapsto \|\pi_V(y + f)\|$
is convex and 1-Lipschitz. Thus by Lemma 5.7 we have

\[ P_y\big( \big| \|\pi_V(y + f)\| - M(\|\pi_V(y + f)\|) \big| \ge t \big) = O(\exp(-t^2/16K^2)). \tag{33} \]

Hence, it is implied that

\begin{align*}
P_{y,y'}\big( \big| \|\pi_V(y + f)\| + \|\pi_V(y' + f)\| - 2M(\|\pi_V(y + f)\|) \big| \le 2t \big) &\ge \big(1 - O(\exp(-t^2/16K^2))\big)^2 \\
&= 1 - O(\exp(-t^2/16K^2)), \tag{34}
\end{align*}

where $y'$ is an independent copy of $y$.

On the other hand, by the triangle inequality

\[ \|\pi_V(y + f)\| + \|\pi_V(y' + f)\| \ge \|\pi_V(y - y')\|. \]

Applying Talagrand inequality once more for the random vector y — y' (see for instance 
Lemma 68]), we see that 



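\[ P_{y,y'}\big( \big| \|\pi_V(y - y')\| - \sqrt{2}\sigma\sqrt{d} \big| \ge t \big) = O(\exp(-t^2/16K^2)). \]

Thus,

\[ P_{y,y'}\big( \|\pi_V(y + f)\| + \|\pi_V(y' + f)\| \ge \sqrt{2}\sigma\sqrt{d} - t \big) = 1 - O(\exp(-t^2/16K^2)). \]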



By comparing with (34), we deduce that

\[ M(\|\pi_V(y + f)\|) \ge \sqrt{1/2}\,\sigma\sqrt{d} - O(K). \]

Substituting this bound back to (34), we obtain the one-sided estimate as desired.